DESCRIPTION

METHOD AND REAGENT FOR THE INHIBITION OF CALCIUM
ACTIVATED CHLORIDE CHANNEL-1 (CLCA-1)



Background Of The Invention 

The present invention concerns compounds, compositions, and methods for the
study, diagnosis, and treatment of conditions and diseases related to the
expression of CLCA (Cl- Channel Ca2+-Activated) genes.

The following is a brief description of the current understanding of CLCAs.
The discussion is not meant to be complete and is provided only for
understanding the invention that follows. The summary is not an admission that
any of the work described below is prior art to the claimed invention.

CLCA proteins are emerging as a new class of channel proteins that mediate
Ca2+-activated Cl- conductance in a variety of tissues. Members of the CLCA
family have been cloned, isolated, and partially characterized from human,
bovine, and murine species. These proteins demonstrate a high degree of
homology in their size, sequence, and predicted structure yet can vary
considerably in tissue distribution. Bovine CLCA1 (bCLCA1 or CaCC) was the
first reported CLCA homolog. The bCLCA1 protein, which was isolated from and is
exclusively detected in tracheal epithelial cells, functions as a
Ca2+-activated Cl- channel (Ran and Benos, 1992, J. Biol. Chem., 267,
3618-3625; Cunningham et al., 1995, J. Biol. Chem., 270, 31016-31026). Another
bovine homolog, bovine lung-endothelial cell adhesion molecule-1 (Lu-ECAM-1),
appears to be involved in the preferential metastasis of melanoma cells to the
lung. Lu-ECAM-1 shares 92% nucleotide identity with bCLCA1 and is expressed in
vascular endothelial cells (Elble et al., 1997, J. Biol. Chem., 272,
27853-27861). It has been shown that Lu-ECAM-1 can mediate the binding of
lung-metastatic mouse B16F10 melanoma cells to endothelial cells (Zhu et al.,
1992, J. Clin. Invest., 89, 1718-1724); however, due to its sequence similarity
to bCLCA1, a role for Lu-ECAM-1 as a chloride channel has been suggested (Elble
et al., supra). The mouse homolog, mCLCA1, appears to have an expression
pattern similar to the cystic fibrosis transmembrane conductance regulator
(CFTR), with expression seen in various secretory epithelial cells, squamous
epithelia, and in some lymphocytes (Gruber et al., 1998, Histochem. Cell Biol.,
110, 43-49).

The three human CLCA homologs (hCLCA1, hCLCA2, and hCLCA3) thus far cloned,
isolated, and partially characterized all retain sequence homology and similar
cDNA length, and all are located on the short arm of chromosome 1 (1p22-p31).
Human CLCA proteins show a restricted pattern of expression in differing
secretory tissues. Human CLCA1 was the first reported calcium activated
chloride channel in humans. The 31,902-bp hCLCA1 gene is located on chromosome
1p22-p31, contains 14 introns, and is preceded by a canonical promoter region
that contains an L1 transposable element. Expression of hCLCA1 is predominant
in intestinal basal crypt epithelia and goblet cells. A protein processing
model has been proposed for hCLCA1 in which the primary translation product
(125-kDa) is cleaved to a 90-kDa and a group of 37- to 41-kDa proteins, the
latter apparently representing different glycosylation products of the same
polypeptide (Gruber et al., 1998, Genomics, 54, 200-214). Transient expression
of hCLCA1 cDNA in HEK 293 cells is associated with an increase in whole-cell
Ca2+-activated Cl- conductance that is susceptible to inhibition with anion
channel blocking compounds. Cell-attached patch recordings of transfected cells
in this study revealed single channels with a slope conductance of 13.4 pS
(Gruber et al., supra).

The hCLCA2 homolog is processed in a manner similar to hCLCA1, resulting in the
formation of a heterodimer consisting of a 90-kDa amino-terminal and an
approximately 35-kDa carboxy-terminal subunit with anchorage to the plasma
membrane via four or five transmembrane domains. Expression of hCLCA2 is
somewhat less restricted than that of hCLCA1, being detected in human lung,
trachea, and breast tissue (Gruber et al., 1999, Am. J. Physiol., 276,
C1261-C1270). Human CLCA2 is expressed in normal breast epithelium but not in
breast tumors of different stages of progression, suggesting that hCLCA2 may
act as a tumor suppressor in breast cancer (Gruber et al., 1999, Cancer Res.,
59, 5488-5491). Human CLCA3 is a truncated, secreted member of the CLCA family
which is expressed in numerous tissues including lung, trachea, spleen, thymus,
and breast tissue. Unlike hCLCA1 and hCLCA2, which are processed into
heterodimers, hCLCA3 mRNA encodes a 37-kDa glycoprotein that corresponds to the
N-terminal extracellular domain of its homologs. When hCLCA3 is expressed in
HEK 293 or CHO cells, the 37-kDa glycoprotein is secreted (Gruber and Pauli,
1999, Biochim. Biophys. Acta, 1444, 418-423).

Holroyd et al., International PCT publication No. WO 99/44620, describe a
calcium-activated chloride channel that is induced by IL-9.

Summary Of The Invention

The invention features novel nucleic acid-based techniques [e.g., enzymatic
nucleic acid molecules (ribozymes), antisense nucleic acids, 2-5A antisense
chimeras, triplex DNA, antisense nucleic acids containing RNA cleaving chemical
groups] and methods for their use to modulate the expression of CLCA (Cl-
Channel Ca2+-Activated) genes.

In a preferred embodiment, the invention features the use of one or more of the
nucleic acid-based techniques independently or in combination to inhibit the
expression of the genes encoding hCLCA1, hCLCA2, hCLCA3, and hCLCA4.

Specifically, the invention features the use of nucleic acid-based techniques to
specifically inhibit the expression of CLCA1 (GenBank accession Nos.
NM_001285, AF039400, AF039401, AF127036), CLCA2 (GenBank accession No.
NM_006536), CLCA3 (GenBank accession No. NM_004921), and CLCA4
(GenBank accession No. NM_012128) genes. In yet another preferred embodiment,
the invention features the inhibition of the CLCA1 gene using the nucleic
acid-based techniques of the instant invention.

In another preferred embodiment, the invention features the use of an
enzymatic nucleic acid molecule, preferably in the hammerhead, NCH (Inozyme),
G-cleaver, amberzyme, zinzyme and/or DNAzyme motif, to inhibit the expression
of CLCA genes.

By "inhibit" it is meant that the activity of CLCA1 or level of RNAs or
equivalent RNAs encoding one or more protein subunits of CLCA1 is reduced
below that observed in the absence of the nucleic acid molecules of the
invention. In one embodiment, inhibition with enzymatic nucleic acid molecules
preferably is below that level observed in the presence of an enzymatically
inactive or attenuated molecule that is able to bind to the same site on the
target RNA, but is unable to cleave that RNA. In another embodiment, inhibition
with antisense oligonucleotides is preferably below that level observed in the
presence of, for example, an oligonucleotide with scrambled sequence or with
mismatches. In another embodiment, inhibition of CLCA1 genes with the nucleic
acid molecule of the instant invention is greater in the presence of the
nucleic acid molecule than in its absence, or than in the presence of a
control, irrelevant, or non-inhibitory oligonucleotide.

By "enzymatic nucleic acid molecule" it is meant a nucleic acid molecule
which has complementarity in a substrate binding region to a specified gene
target, and also has an enzymatic activity which is active to specifically
cleave target RNA. That is, the enzymatic nucleic acid molecule is able to
intermolecularly cleave RNA and thereby inactivate a target RNA molecule. These
complementary regions allow sufficient hybridization of the enzymatic nucleic
acid molecule to the target RNA and thus permit cleavage. One hundred percent
complementarity is preferred, but complementarity as low as 50-75% may also be
useful in this invention. The nucleic acids may be modified at the base, sugar,
and/or phosphate groups. The term enzymatic nucleic acid is used
interchangeably with phrases such as ribozymes, catalytic RNA, enzymatic RNA,
catalytic DNA, aptazyme or aptamer-binding ribozyme, regulatable ribozyme,
catalytic oligonucleotides, nucleozyme, DNAzyme, RNA enzyme, endoribonuclease,
endonuclease, minizyme, leadzyme, oligozyme or DNA enzyme. All of these
terminologies describe nucleic acid molecules with enzymatic activity. The
specific enzymatic nucleic acid molecules described in the instant application
are not meant to be limiting and those skilled in the art will recognize that
all that is important in an enzymatic nucleic acid molecule of this invention
is that it have a specific substrate binding site which is complementary to one
or more of the target nucleic acid regions, and that it have nucleotide
sequences within or surrounding that substrate binding site which impart a
nucleic acid cleaving activity to the molecule (Cech et al., U.S. Patent No.
4,987,071; Cech et al., 1988, JAMA).

By "nucleic acid molecule" as used herein is meant a molecule having 
nucleotides. The nucleic acid can be single, double, or multiple stranded and may 
comprise modified or unmodified nucleotides or non-nucleotides or various mixtures 
and combinations thereof. 

By "enzymatic portion" or "catalytic domain" is meant that portion/region of
the enzymatic nucleic acid molecule essential for cleavage of a nucleic acid
substrate (for example, see Figures 1-4).

By "substrate binding arm" or "substrate binding domain" is meant that
portion/region of a ribozyme which is complementary to (i.e., able to base-pair
with) a portion of its substrate. Generally, such complementarity is 100%, but
can be less if desired. For example, as few as 10 bases out of 14 may be
base-paired. Examples of such arms are shown generally in Figures 1-4. That is,
these arms contain sequences within a ribozyme which are intended to bring
ribozyme and target RNA together through complementary base-pairing
interactions. The ribozyme of the invention may have binding arms that are
contiguous or non-contiguous and may be of varying lengths. The length of the
binding arm(s) is preferably greater than or equal to four nucleotides and of
sufficient length to stably interact with the target RNA; specifically 12-100
nucleotides; more specifically 14-24 nucleotides long. If two binding arms are
chosen, the design is such that the lengths of the binding arms are symmetrical
(i.e., each of the binding arms is of the same length; e.g., five and five
nucleotides, six and six nucleotides or seven and seven nucleotides long) or
asymmetrical (i.e., the binding arms are of different length; e.g., six and
three nucleotides; three and six nucleotides long; four and five nucleotides
long; four and six nucleotides long; four and seven nucleotides long; and the
like).

By "NCH" or "Inozyme" motif is meant an enzymatic nucleic acid molecule
comprising a motif as described in Ludwig et al., USSN No. 09/406,643, filed
September 27, 1999, entitled "COMPOSITIONS HAVING RNA CLEAVING
ACTIVITY", and International PCT publication Nos. WO 98/58058 and WO
98/58057, all incorporated by reference herein in their entirety including the
drawings.

By "G-cleaver" motif is meant an enzymatic nucleic acid molecule
comprising a motif as described in Eckstein et al., International PCT
publication No. WO 99/16871, incorporated by reference herein in its entirety
including the drawings.

By "zinzyme" motif is meant a class II enzymatic nucleic acid molecule
comprising a motif as described in Beigelman et al., International PCT
publication No. WO 99/55857, incorporated by reference herein in its entirety
including the drawings. Zinzymes represent a non-limiting example of an
enzymatic nucleic acid molecule that does not require a ribonucleotide (2'-OH)
group within its own nucleic acid sequence for activity.

By "amberzyme" motif is meant a class I enzymatic nucleic acid molecule
comprising a motif as described in Beigelman et al., International PCT
publication No. WO 99/55857, incorporated by reference herein in its entirety
including the drawings. Amberzymes represent a non-limiting example of an
enzymatic nucleic acid molecule that does not require a ribonucleotide (2'-OH)
group within its own nucleic acid sequence for activity.

By "DNAzyme" is meant an enzymatic nucleic acid molecule that does not
require the presence of a ribonucleotide (2'-OH) group within the DNAzyme
molecule for its activity. In particular embodiments the enzymatic nucleic acid
molecule may have an attached linker(s) or other attached or associated groups,
moieties, or chains containing one or more nucleotides with 2'-OH groups.
DNAzymes can be synthesized chemically or expressed endogenously in vivo, by
means of a single stranded DNA vector or equivalent thereof.

By "sufficient length" is meant an oligonucleotide of greater than or equal to 3
nucleotides that is of a length great enough to provide the intended function
under the expected condition. For example, for binding arms of enzymatic
nucleic acid, "sufficient length" means that the binding arm sequence is long
enough to provide stable binding to a target site under the expected binding
conditions. Preferably, the binding arms are not so long as to prevent useful
turnover.

By "stably interact" is meant interaction of the oligonucleotides with target
nucleic acid (e.g., by forming hydrogen bonds with complementary nucleotides in
the target under physiological conditions).

By "equivalent" RNA to CLCA1 is meant to include those naturally occurring
RNA molecules having homology (partial or complete) to CLCA1 proteins or
encoding for proteins with similar function as CLCA1 in various organisms,
including human, rodent, primate, rabbit, pig, protozoans, fungi, plants, and
other microorganisms and parasites. The equivalent RNA sequence also includes,
in addition to the coding region, regions such as the 5'-untranslated region,
3'-untranslated region, introns, intron-exon junction and the like.

By "homology" is meant the nucleotide sequence of two or more nucleic acid
molecules is partially or completely identical.

By "antisense nucleic acid", it is meant a non-enzymatic nucleic acid
molecule that binds to target RNA by means of RNA-RNA or RNA-DNA or RNA-PNA
(peptide nucleic acid; Egholm et al., 1993 Nature 365, 566) interactions and
alters the activity of the target RNA (for a review, see Stein and Cheng, 1993
Science 261, 1004 and Woolf et al., US patent No. 5,849,902). Typically,
antisense molecules will be complementary to a target sequence along a single
contiguous sequence of the antisense molecule. However, in certain embodiments,
an antisense molecule may bind to substrate such that the substrate molecule
forms a loop, and/or an antisense molecule may bind such that the antisense
molecule forms a loop. Thus, the antisense molecule may be complementary to two
(or even more) non-contiguous substrate sequences or two (or even more)
non-contiguous sequence portions of an antisense molecule may be complementary
to a target sequence or both. For a review of current antisense strategies, see
Schmajuk et al., 1999, J. Biol. Chem., 274, 21783-21789; Delihas et al., 1997,
Nature, 15, 751-753; Stein et al., 1997, Antisense N. A. Drug Dev., 7, 151;
Crooke, 1998, Biotech. Genet. Eng. Rev., 15, 121-157; Crooke, 1997, Ad.
Pharmacol., 40, 1-49. In addition, antisense DNA can be used to target RNA by
means of DNA-RNA interactions, thereby activating RNase H, which digests the
target RNA in the duplex. Antisense DNA can be synthesized chemically or
expressed via the use of a single stranded DNA expression vector or equivalent
thereof.

By "2-5A antisense chimera" it is meant an antisense oligonucleotide
containing a 5'-phosphorylated 2'-5'-linked adenylate residue. These chimeras
bind to target RNA in a sequence-specific manner and activate a cellular
2-5A-dependent ribonuclease which, in turn, cleaves the target RNA (Torrence et
al., 1993 Proc. Natl. Acad. Sci. USA 90, 1300).

By "triplex DNA" it is meant an oligonucleotide that can bind to a
double-stranded DNA in a sequence-specific manner to form a triple-strand
helix. Formation of such triple helix structure has been shown to inhibit
transcription of the targeted gene (Duval-Valentin et al., 1992 Proc. Natl.
Acad. Sci. USA 89, 504).

By "gene" it is meant a nucleic acid that encodes an RNA. 

By "complementarity" is meant that a nucleic acid can form hydrogen bond(s)
with another RNA sequence by either traditional Watson-Crick or other
non-traditional types. In reference to the nucleic acid molecules of the
present invention, the binding free energy for a nucleic acid molecule with its
target or complementary sequence is sufficient to allow the relevant function
of the nucleic acid to proceed, e.g., ribozyme cleavage, antisense or triple
helix inhibition. Determination of binding free energies for nucleic acid
molecules is well known in the art (see, e.g., Turner et al., 1987, CSH Symp.
Quant. Biol. LII pp. 123-133; Frier et al., 1986, Proc. Nat. Acad. Sci. USA
83:9373-9377; Turner et al., 1987, J. Am. Chem. Soc. 109:3783-3785). A percent
complementarity indicates the percentage of contiguous residues in a nucleic
acid molecule which can form hydrogen bonds (e.g., Watson-Crick base pairing)
with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, 10 out of 10 being
50%, 60%, 70%, 80%, 90%, and 100% complementary). "Perfectly complementary"
means that all the contiguous residues of a nucleic acid sequence will hydrogen
bond with the same number of contiguous residues in a second nucleic acid
sequence.
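
The percent complementarity calculation described above can be illustrated
with a minimal sketch. The function name, the example sequences, and the
restriction to Watson-Crick pairs over an equal-length, antiparallel alignment
are assumptions made for illustration only; they are not taken from the
specification, which also contemplates non-traditional pairing.

```python
# Watson-Crick partners for RNA bases. Illustrative simplification: G-U wobble
# and other non-traditional interactions are deliberately ignored here.
WC_PARTNER = {"A": "U", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(oligo: str, target: str) -> float:
    """Percentage of oligo residues that can Watson-Crick pair with target.

    Both sequences are written 5'->3' and must have the same length. The
    duplex is antiparallel, so the oligo is compared with the reversed target.
    """
    if len(oligo) != len(target):
        raise ValueError("sequences must be the same length")
    if not oligo:
        raise ValueError("sequences must be non-empty")
    paired = sum(
        1
        for o, t in zip(oligo.upper(), reversed(target.upper()))
        if WC_PARTNER.get(o) == t
    )
    return 100.0 * paired / len(oligo)


# Hypothetical 10-nucleotide target site and two hypothetical oligonucleotides.
target = "GGAUCCAUGC"
perfect = "GCAUGGAUCC"      # exact reverse complement of the target
mismatched = "GCAUGGAAAA"   # last three residues cannot pair

print(percent_complementarity(perfect, target))     # 10 of 10 residues pair
print(percent_complementarity(mismatched, target))  # 7 of 10 residues pair
```

In this sketch, 10 pairing residues out of 10 correspond to "perfectly
complementary", and 7 out of 10 correspond to 70% complementarity, mirroring
the "out of 10" example in the definition.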

At least seven basic varieties of naturally occurring enzymatic nucleic acids
are known presently. Each can catalyze the hydrolysis of RNA phosphodiester
bonds in trans (and thus can cleave other RNA molecules) under physiological
conditions. Table I summarizes some of the characteristics of these ribozymes.
In general, enzymatic nucleic acids act by first binding to a target RNA. Such
binding occurs through the target binding portion of an enzymatic nucleic acid
which is held in close proximity to an enzymatic portion of the molecule that
acts to cleave the target RNA. Thus, the enzymatic nucleic acid first
recognizes and then binds a target RNA through complementary base-pairing, and
once bound to the correct site, acts enzymatically to cut the target RNA.
Strategic cleavage of such a target RNA will destroy its ability to direct
synthesis of an encoded protein. After an enzymatic nucleic acid has bound and
cleaved its RNA target, it is released from that RNA to search for another
target and can repeatedly bind and cleave new targets. Thus, a single ribozyme
molecule is able to cleave many molecules of target RNA. In addition, the
ribozyme is a highly specific inhibitor of gene expression, with the
specificity of inhibition depending not only on the base-pairing mechanism of
binding to the target RNA, but also on the mechanism of target RNA cleavage.
Single mismatches, or base-substitutions, near the site of cleavage can
completely eliminate catalytic activity of a ribozyme.

The enzymatic nucleic acid molecules that cleave the specified sites in
CLCA1-specific RNAs represent a novel therapeutic approach to treat Chronic
Obstructive Pulmonary Diseases (COPDs), chronic bronchitis, asthma, cystic
fibrosis, obstructive bowel syndrome, and other indications that may respond to
the level of CLCA1.

In one of the preferred embodiments of the inventions described herein, the
enzymatic nucleic acid molecule is formed in a hammerhead or hairpin motif, but
may also be formed in the motif of a hepatitis delta virus, group I intron,
group II intron or RNase P RNA (in association with an RNA guide sequence),
Neurospora VS RNA, DNAzymes, NCH cleaving motifs, or G-cleavers. Examples of
such hammerhead motifs are described by Dreyfus, supra, and Rossi et al., 1992,
AIDS Research and Human Retroviruses 8, 183. Examples of hairpin motifs are
described by Hampel et al., EP0360257; Hampel and Tritz, 1989 Biochemistry 28,
4929; Feldstein et al., 1989, Gene 82, 53; Haseloff and Gerlach, 1989, Gene,
82, 43; Hampel et al., 1990 Nucleic Acids Res. 18, 299; and Chowrira &
McSwiggen, U.S. Patent No. 5,631,359. The hepatitis delta virus motif is
described by Perrotta and Been, 1992 Biochemistry 31, 16. The RNase P motif is
described by Guerrier-Takada et al., 1983 Cell 35, 849; Forster and Altman,
1990, Science 249, 783; and Li and Altman, 1996, Nucleic Acids Res. 24, 835.
The Neurospora VS RNA ribozyme motif is described by Collins (Saville and
Collins, 1990 Cell 61, 685-696; Saville and Collins, 1991 Proc. Natl. Acad.
Sci. USA 88, 8826-8830; Collins and Olive, 1993 Biochemistry 32, 2795-2799; Guo
and Collins, 1995, EMBO. J. 14, 363). Group II introns are described by Griffin
et al., 1995, Chem. Biol. 2, 761; Michels and Pyle, 1995, Biochemistry 34,
2965; and Pyle et al., International PCT Publication No. WO 96/22689. The Group
I intron is described by Cech et al., U.S. Patent 4,987,071. DNAzymes are
described by Usman et al., International PCT Publication No. WO 95/11304;
Chartrand et al., 1995, NAR 23, 4092; Breaker et al., 1995, Chem. Bio. 2, 655;
and Santoro et al., 1997, PNAS 94, 4262. NCH cleaving motifs are described in
Ludwig & Sproat, International PCT Publication No. WO 98/58058; and G-cleavers
are described in Kore et al., 1998, Nucleic Acids Research 26, 4116-4120 and
Eckstein et al., International PCT Publication No. WO 99/16871. Additional
motifs such as the Aptazyme (Breaker et al., WO 98/43993), Amberzyme (Class I
motif; Figure 3; Beigelman et al., International PCT publication No. WO
99/55857) and Zinzyme (Beigelman et al., International PCT publication No. WO
99/55857) can also be used in the present invention. All these references are
incorporated by reference herein in their totalities, including drawings. These
specific motifs are not limiting in the invention, and those skilled in the art
will recognize that all that is important in an enzymatic nucleic acid molecule
of this invention is that it has a specific substrate binding site which is
complementary to one or more of the target gene RNA regions, and that it has
nucleotide sequences within or surrounding that substrate binding site which
impart an RNA cleaving activity to the molecule (Cech et al., U.S. Patent No.
4,987,071).

In preferred embodiments of the present invention, a nucleic acid molecule,
e.g., an antisense molecule, a triplex DNA, or a ribozyme, is 13 to 100
nucleotides in length, e.g., in specific embodiments 35, 36, 37, or 38
nucleotides in length (e.g., for particular ribozymes or antisense). In
particular embodiments, the nucleic acid molecule is 15-100, 17-100, 20-100,
21-100, 23-100, 25-100, 27-100, 30-100, 32-100, 35-100, 40-100, 50-100, 60-100,
70-100, or 80-100 nucleotides in length. Instead of 100 nucleotides being the
upper limit on the length ranges specified above, the upper limit of the length
range can be, for example, 30, 40, 50, 60, 70, or 80 nucleotides. Thus, for any
of the length ranges, the length range for particular embodiments has a lower
limit as specified, with an upper limit as specified which is greater than the
lower limit. For example, in a particular embodiment, the length range can be
35-50 nucleotides in length. All such ranges are expressly included. Also in
particular embodiments, a nucleic acid molecule can have a length which is any
of the lengths specified above, for example, 21 nucleotides in length.
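
The rule for combining the lower and upper length limits listed above can be
sketched as a short enumeration. The variable and function names are
hypothetical; the limit values are those recited in the preceding paragraph,
and a range is kept only when its upper limit is greater than its lower limit.

```python
# Lower and upper length limits (in nucleotides) recited in the text above.
LOWER_LIMITS = [13, 15, 17, 20, 21, 23, 25, 27, 30, 32, 35, 40, 50, 60, 70, 80]
UPPER_LIMITS = [30, 40, 50, 60, 70, 80, 100]


def length_ranges() -> list[tuple[int, int]]:
    """All (lower, upper) pairs in which the upper limit exceeds the lower."""
    return [(lo, hi) for lo in LOWER_LIMITS for hi in UPPER_LIMITS if hi > lo]


def in_some_range(length: int) -> bool:
    """True if the given length falls inside at least one enumerated range."""
    return any(lo <= length <= hi for lo, hi in length_ranges())


ranges = length_ranges()
print((35, 50) in ranges)   # the 35-50 example given in the text
print((80, 30) in ranges)   # excluded: upper limit not greater than lower
print(in_some_range(21))    # a 21-nucleotide molecule
```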

In a preferred embodiment, the invention provides a method for producing a
class of nucleic acid-based gene inhibiting agents which exhibit a high degree of
specificity for the RNA of a desired target. For example, the enzymatic nucleic acid
molecule is preferably targeted to a highly conserved sequence region of target 
RNAs encoding CLCA proteins (for example, CLCA1, CLCA2, CLCA3 and/or 
CLCA4) such that specific treatment of a disease or condition can be provided with 
either one or several nucleic acid molecules of the invention. Such nucleic acid 
molecules can be delivered exogenously to specific tissue or cellular targets as 
required. Alternatively, the nucleic acid molecules (e.g., ribozymes and antisense) 
can be expressed from DNA and/or RNA vectors that are delivered to specific cells. 

In a preferred embodiment, the invention features the use of nucleic acid-based 
inhibitors of the invention to specifically target genes that share homology with the 
CLCA1 gene. 

As used herein "cell" is used in its usual biological sense, and does not refer to 
an entire multicellular organism, e.g., specifically does not refer to a human. The 
cell may be present in a non-human multicellular organism, e.g., birds, plants and 
mammals such as cows, sheep, apes, monkeys, swine, dogs, and cats. 

By "CLCA proteins" is meant, a protein or a mutant protein derivative thereof, 
comprising a calcium activated chloride channel protein. 

By "highly conserved sequence region" is meant a nucleotide sequence of one
or more regions in a target gene that does not vary significantly from one
generation to the other or from one biological system to the other.

The nucleic acid-based inhibitors of CLCA1 expression are useful for the 
prevention and/or treatment of diseases and conditions including Chronic 
Obstructive Pulmonary Disease (COPD), chronic bronchitis, asthma, cystic fibrosis, 
obstructive bowel syndrome, and any other diseases or conditions that are related to 
or will respond to the levels of CLCA1 in a cell or tissue, alone or in combination 
with other therapies. 

By "related" is meant that the reduction of CLCA1 expression (specifically 
CLCA1 gene) RNA levels and thus reduction in the level of the respective protein 
will relieve, to some extent, the symptoms of the disease or condition. 

The nucleic acid-based inhibitors of the invention are added directly, or can be
complexed with cationic lipids, packaged within liposomes, or otherwise delivered
to target cells or tissues. The nucleic acid or nucleic acid complexes can be
locally administered to relevant tissues ex vivo, or in vivo through injection,
infusion pump or stent, with or without their incorporation in biopolymers. In
preferred embodiments, the enzymatic nucleic acid inhibitors comprise
sequences which are complementary to the substrate sequences in Tables III to
IX. Examples of such enzymatic nucleic acid molecules also are shown in Tables
III to IX. Examples of such enzymatic nucleic acid molecules consist
essentially of sequences defined in these Tables.

In yet another embodiment, the invention features antisense nucleic acid
molecules and 2-5A chimera including sequences complementary to the substrate
sequences shown in Tables III to IX. Such nucleic acid molecules can include
sequences as shown for the binding arms of the enzymatic nucleic acid molecules
in Tables III to VIII and sequences shown as GeneBloc™ sequences in Table IX.
Similarly, triplex molecules can be provided targeted to the corresponding DNA
target regions, and containing the DNA equivalent of a target sequence or a
sequence complementary to the specified target (substrate) sequence. Typically,
antisense molecules will be complementary to a target sequence along a single
contiguous sequence of the antisense molecule. However, in certain embodiments,
an antisense molecule may bind to substrate such that the substrate molecule
forms a loop, and/or an antisense molecule may bind such that the antisense
molecule forms a loop. Thus, the antisense molecule may be complementary to two
(or even more) non-contiguous substrate sequences or two (or even more)
non-contiguous sequence portions of an antisense molecule may be complementary
to a target sequence or both.

By "consists essentially of" is meant that the active nucleic acid molecule of
the invention, for example, an enzymatic nucleic acid molecule, contains an
enzymatic center or core equivalent to those in the examples, and binding arms
able to bind RNA such that cleavage at the target site occurs. Other sequences
can be present which do not interfere with such cleavage. Thus, a core region
can, for example, include one or more loop, stem-loop structure, or linker
which does not prevent enzymatic activity. Thus, the underlined regions in the
sequences in Tables III, IV and VIII can be such a loop, stem-loop, nucleotide
linker, and/or non-nucleotide linker and can be represented generally as
sequence "X". For example, a core sequence for a hammerhead enzymatic nucleic
acid can comprise a conserved sequence, such as 5'-CUGAUGAG-3' and 5'-CGAA-3'
connected by "X", where X is 5'-GCCGUUAGGC-3' (SEQ ID NO 5450), or any other
Stem II region known in the art.
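
The hammerhead core arrangement described above, two conserved segments joined
by a variable region "X", can be sketched as simple string assembly. The
conserved segments and the example X sequence come from the text; the function
names, the placement of binding arms on either side of the core, and the arm
sequences themselves are hypothetical placeholders used only for illustration.

```python
# Conserved hammerhead core segments and example "X" region from the text.
CORE_5 = "CUGAUGAG"
CORE_3 = "CGAA"
STEM_II_X = "GCCGUUAGGC"

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def is_stem_loop(x: str, stem_len: int = 3) -> bool:
    """True if the first stem_len bases of x can pair with its last stem_len."""
    return x[:stem_len] == reverse_complement(x[-stem_len:])


def hammerhead(arm_5: str, arm_3: str, x: str = STEM_II_X) -> str:
    """Assemble 5'-arm / core-X-core / 3'-arm (assumed layout, see lead-in)."""
    return arm_5 + CORE_5 + x + CORE_3 + arm_3


# Hypothetical seven-nucleotide binding arms (symmetrical design).
ribozyme = hammerhead("ACGUACG", "UGCAUGC")

print(CORE_5 + STEM_II_X + CORE_3)  # core with the example X inserted
print(is_stem_loop(STEM_II_X))      # GCC ... GGC can close a stem around GUUA
print(len(ribozyme))                # total length with the placeholder arms
```

Any other loop, stem-loop, or linker region could be passed as `x` in this
sketch, in keeping with the statement that X may be any Stem II region known in
the art.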

In another aspect of the invention, ribozymes or antisense molecules that
interact with target RNA molecules and inhibit CLCA1 (specifically CLCA1 gene)
activity are expressed from transcription units inserted into DNA or RNA
vectors. The recombinant vectors are preferably DNA plasmids or viral vectors.
Ribozyme or antisense expressing viral vectors could be constructed based on,
but not limited to, adeno-associated virus, retrovirus, adenovirus, or
alphavirus. Preferably, the recombinant vectors capable of expressing the
ribozymes or antisense are delivered as described above, and persist in target
cells. Alternatively, viral vectors may be used that provide for transient
expression of ribozymes or antisense. Such vectors can be repeatedly
administered as necessary. Once expressed, the ribozymes or antisense bind to
the target RNA and inhibit its function or expression. Delivery of ribozyme or
antisense expressing vectors can be systemic, such as by intravenous or
intramuscular administration, by administration to target cells explanted from
the patient followed by reintroduction into the patient, or by any other means
that would allow for introduction into the desired target cell. Antisense DNA
can be expressed endogenously via the use of a single stranded DNA
intracellular expression vector.

By "RNA" is meant a molecule comprising at least one ribonucleotide residue.
By "ribonucleotide" is meant a nucleotide with a hydroxyl group at the 2'
position of a β-D-ribo-furanose moiety.

By "vectors" is meant any nucleic acid- and/or viral-based technique used to
deliver a desired nucleic acid.

By "patient" is meant an organism which is a donor or recipient of explanted
cells or the cells themselves. "Patient" also refers to an organism to which the
nucleic acid molecules of the invention can be administered. Preferably, a
patient is a mammal or mammalian cells. More preferably, a patient is a human
or human cells.

The nucleic acid molecules of the instant invention, individually, or in 
combination or in conjunction with other drugs, can be used to treat diseases or 
conditions discussed above. For example, to treat a disease or condition associated 
with the levels of CLCA1, the patient may be treated, or other appropriate cells may
be treated, as is evident to those skilled in the art, individually or in combination 
with one or more drugs under conditions suitable for the treatment. 

In a further embodiment, the described molecules, such as antisense or 
ribozymes, can be used in combination with other known treatments to treat 
conditions or diseases discussed above. For example, the described molecules could
be used in combination with one or more known therapeutic agents to treat Chronic
Obstructive Pulmonary Diseases (COPDs), chronic bronchitis, asthma, cystic 
fibrosis, obstructive bowel syndrome, and/or other disease states or conditions which 
respond to the modulation of CLCA1 expression. 

In another preferred embodiment, the invention features nucleic acid-based
inhibitors (e.g., enzymatic nucleic acid molecules (ribozymes), antisense nucleic
acids, 2-5A antisense chimeras, triplex DNA, antisense nucleic acids containing
RNA cleaving chemical groups) and methods for their use to down-regulate or
inhibit the expression of genes (e.g., CLCA1) capable of progression and/or
maintenance of Chronic Obstructive Pulmonary Diseases (COPDs), chronic
bronchitis, asthma, cystic fibrosis, obstructive bowel syndrome, and/or other disease
states or conditions which respond to the modulation of CLCA1 expression.

By "comprising" is meant including, but not limited to, whatever follows the
word "comprising". Thus, use of the term "comprising" indicates that the listed
elements are required or mandatory, but that other elements are optional and may or
may not be present. By "consisting of" is meant including, and limited to, whatever
follows the phrase "consisting of". Thus, the phrase "consisting of" indicates that the
listed elements are required or mandatory, and that no other elements may be
present. By "consisting essentially of" is meant including any elements listed after
the phrase, and limited to other elements that do not interfere with or contribute to
the activity or action specified in the disclosure for the listed elements. Thus, the
phrase "consisting essentially of" indicates that the listed elements are required or
mandatory, but that other elements are optional and may or may not be present
depending upon whether or not they affect the activity or action of the listed
elements.

The foregoing description of the various aspects and embodiments is
provided with reference to the exemplary calcium activated chloride channel gene
CLCA1, which is also referred to as CaCC1 or ICACC-1. However, the various
aspects and embodiments are also directed to other genes which express CLCA1 or
CaCC1-like proteins (for example hCLCA2, hCLCA3, hCLCA4, CaCC2, and
CaCC3). Those additional genes can be analyzed for target sites using the methods
described for CLCA1. Thus, the inhibition and the effects of such inhibition of the
other genes can be performed as described herein.

Other features and advantages of the invention will be apparent from the 
following description of the preferred embodiments thereof, and from the claims.
Description Of The Preferred Embodiments
First the drawings will be described briefly. 
Drawings 

Figure 1 shows examples of chemically stabilized ribozyme motifs. HH Rz
represents the hammerhead ribozyme motif (Usman et al, 1996, Curr. Op. Struct. Bio.,
1, 527); NCH Rz represents the NCH ribozyme motif (Ludwig & Sproat,
International PCT Publication No. WO 98/58058); G-Cleaver represents the G-cleaver
ribozyme motif (Kore et al, 1998, Nucleic Acids Research 26, 4116-4120). N or n
represent independently a nucleotide which may be the same or different and have
complementarity to each other; rI represents a ribo-Inosine nucleotide; the arrow
indicates the site of cleavage within the target. Position 4 of the HH Rz and the NCH
Rz is shown as having a 2'-C-allyl modification, but those skilled in the art will
recognize that this position can be modified with other modifications well known in
the art, so long as such modifications do not significantly inhibit the activity of the
ribozyme.

Figure 2 shows an example of the Amberzyme ribozyme motif that is 
chemically stabilized (see, for example, Beigelman et al, International PCT 
publication No. WO 99/55857, incorporated by reference herein; also referred to as 
Class I Motif). The Amberzyme motif is a class of enzymatic nucleic acid molecules that
do not require the presence of a ribonucleotide (2'-OH) group for activity.

Figure 3 shows an example of the Zinzyme A ribozyme motif that is 
chemically stabilized (Beigelman et al, International PCT publication No. WO 
99/55857, incorporated by reference herein; also referred to as Class A or Class II 
Motif). The Zinzyme motif is a class of enzymatic nucleic acid molecules that do not
require the presence of a ribonucleotide (2'-OH) group for activity.

Figure 4 shows an example of a DNAzyme motif described by Santoro et al, 
1997, PNAS, 94, 4262. 

Figures 5A and 5B are diagrammatic schemes representative of the process 
used for Target Discovery in the instant invention. The process for Target Discovery 
is described in Jarvis et al, International PCT publication No. WO 98/50530,
incorporated by reference herein in its entirety including the Figures. 

Mechanism of action of Nucleic Acid Molecules of the Invention 
Antisense: Antisense molecules may be modified or unmodified RNA, DNA,
or mixed polymer oligonucleotides which primarily function by specifically binding 
to matching sequences resulting in inhibition of peptide synthesis (Wu-Pong, Nov 
1994, BioPharm, 20-33). The antisense oligonucleotide binds to target RNA by 
Watson-Crick base-pairing and blocks gene expression by preventing ribosomal
translation of the bound sequences either by steric blocking or by activating RNase 
H enzyme. Antisense molecules can also alter protein synthesis by interfering with 
RNA processing or transport from the nucleus into the cytoplasm (Mukhopadhyay & 
Roth, 1996, Crit. Rev. in Oncogenesis 7, 151-190). 
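
The Watson-Crick pairing described above can be illustrated with a minimal
sketch that derives a DNA antisense strand from an RNA target segment. The
function name and the 12-nucleotide example segment are hypothetical and are
not taken from any CLCA1 sequence or from the Tables; the sketch covers base
complementarity only and ignores chemical modifications, secondary structure,
and RNase H considerations.

```python
# Watson-Crick partners for an RNA target paired with a DNA antisense strand.
DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(target_rna: str) -> str:
    """Return the DNA antisense strand (5'->3') for an RNA target given 5'->3'."""
    target = target_rna.upper()
    # Read the target backwards so that the antisense strand is also written 5'->3'.
    return "".join(DNA_PARTNER[base] for base in reversed(target))


# Hypothetical target segment used purely for illustration.
example_target = "AUGGCUACGUUA"
example_antisense = antisense_dna(example_target)
print(example_target, "->", example_antisense)
```

Writing both strands 5' to 3' matches the convention used for the sequences in
the Tables.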

In addition, binding of single stranded DNA to RNA may result in nuclease
degradation of the heteroduplex (Wu-Pong, supra; Crooke, supra). To date, the only
backbone modified DNA chemistries which act as substrates for RNase H are
phosphorothioates, phosphorodithioates, and borontrifluoridates. Recently it has
been reported that 2'-arabino and 2'-fluoro arabino-containing oligos can also
activate RNase H activity.

A number of antisense molecules have been described that utilize novel 
configurations of chemically modified nucleotides, secondary structure, and/or 
RNase H substrate domains (Woolf et al, International PCT Publication No. WO 
98/13526; Thompson et al, International PCT Publication No. WO 99/54459;
Hartmann et al, USSN 60/101,174 which was filed on September 21, 1998); all of
these are incorporated by reference herein in their entirety.

In addition, antisense deoxyoligoribonucleotides can be used to target RNA by 
means of DNA-RNA interactions, thereby activating RNase H, which digests the 
target RNA in the duplex. Antisense DNA can be expressed endogenously in vivo 
via the use of a single stranded DNA intracellular expression vector or equivalents
and variations thereof. 

Triplex Forming Oligonucleotides (TFO): Single stranded DNA may be
designed to bind to genomic DNA in a sequence specific manner. TFOs are
comprised of pyrimidine-rich oligonucleotides which bind DNA helices through
Hoogsteen base-pairing (Wu-Pong, supra). The resulting triple helix composed of
the DNA sense, DNA antisense, and TFO disrupts RNA synthesis by RNA 
polymerase. The TFO mechanism may result in gene expression or cell death since 
binding may be irreversible (Mukhopadhyay & Roth, supra). 

2-5A Antisense Chimera: The 2-5A system is an interferon mediated
mechanism for RNA degradation found in higher vertebrates (Mitra et al, 1996,
Proc Nat Acad Sci USA 93, 6780-6785). Two types of enzymes, 2-5A synthetase
and RNase L, are required for RNA cleavage. The 2-5A synthetases require double
stranded RNA to form 2'-5' oligoadenylates (2-5A). 2-5A then acts as an allosteric
effector for utilizing RNase L, which has the ability to cleave single stranded RNA.
The ability to form 2-5A structures with double stranded RNA makes this system
particularly useful for inhibition of viral replication.

(2'-5') oligoadenylate structures may be covalently linked to antisense 
molecules to form chimeric oligonucleotides capable of RNA cleavage (Torrence, 
supra). These molecules putatively bind and activate a 2-5A dependent RNase; the
oligonucleotide/enzyme complex then binds to a target RNA molecule which can 
then be cleaved by the RNase enzyme. 

Enzymatic Nucleic Acid: Seven basic varieties of naturally occurring
enzymatic RNAs are presently known. In addition, several in vitro selection 
(evolution) strategies (Orgel, 1979, Proc. R. Soc. London, B 205, 435) have been 
used to evolve new nucleic acid catalysts capable of catalyzing cleavage and ligation 
of phosphodiester linkages (Joyce, 1989, Gene, 82, 83-87; Beaudry et al, 1992, 
Science 257, 635-641; Joyce, 1992, Scientific American 267, 90-97; Breaker et al, 
1994, TIB TECH 12, 268; Bartel et al, 1993, Science 261:1411-1418; Szostak, 1993, 
TIBS 17, 89-93; Kumar et al, 1995, FASEB J., 9, 1183; Breaker, 1996, Curr. Op. 
Biotech., 7, 442; Santoro et al, 1997, Proc. Natl. Acad. Sci., 94, 4262; Tang et al, 
1997, RNA 3, 914; Nakamaye & Eckstein, 1994, supra; Long & Uhlenbeck, 1994, 
supra; Ishizaka et al, 1995, supra; Vaish et al, 1997, Biochemistry 36, 6495; all of 
these are incorporated by reference herein). Each can catalyze a series of reactions 
including the hydrolysis of phosphodiester bonds in trans (and thus can cleave other 
RNA molecules) under physiological conditions. 

Nucleic acid molecules of this invention will block to some extent CLCA1 
protein expression and can be used to treat disease or diagnose disease associated 
with the levels of CLCA1. 

The enzymatic nature of a ribozyme has significant advantages; for example, the
concentration of ribozyme necessary to effect a therapeutic treatment is lower. This
advantage reflects the ability of the ribozyme to act enzymatically. Thus, a single 
ribozyme molecule is able to cleave many molecules of target RNA. In addition, the 
ribozyme is a highly specific inhibitor, with the specificity of inhibition depending 
not only on the base-pairing mechanism of binding to the target RNA, but also on 
the mechanism of target RNA cleavage. Single mismatches, or base-substitutions, 
near the site of cleavage can be chosen to completely eliminate catalytic activity of a
ribozyme. 

Nucleic acid molecules having an endonuclease enzymatic activity are able to 
repeatedly cleave other separate RNA molecules in a nucleotide base sequence- 
specific manner. Such enzymatic nucleic acid molecules can be targeted to virtually 
any RNA transcript, and achieve efficient cleavage in vitro (Zaug et al, 324, Nature 
429 1986; Uhlenbeck, 1987 Nature 328, 596; Kim et al, 84 Proc. Natl. Acad. Sci.
USA 8788, 1987; Dreyfus, 1988, Einstein Quart. J. Bio. Med., 6, 92; Haseloff and 
Gerlach, 334 Nature 585, 1988; Cech, 260 JAMA 3030, 1988; and Jefferies et al, 17 
Nucleic Acids Research 1371, 1989; Santoro et al., 1997 supra). 

Because of their sequence specificity, trans-cleaving ribozymes show promise
as therapeutic agents for human disease (Usman and McSwiggen, 1995 Ann. Rep. 
Med. Chem. 30, 285-294; Christoffersen and Marr, 1995 J. Med. Chem. 38, 2023- 
2037). Ribozymes can be designed to cleave specific RNA targets within the 
background of cellular RNA. Such a cleavage event renders the RNA non- 
functional and abrogates protein expression from that RNA. In this manner, 
synthesis of a protein associated with a disease state can be selectively inhibited 
(Warashina et al, 1999, Chemistry and Biology, 6, 237-250). 

The nucleic acid molecules of the instant invention are also referred to as 
GeneBloc reagents, which are essentially nucleic acid molecules (e.g., ribozymes,
antisense) capable of down-regulating gene expression. 

GeneBlocs are modified oligonucleotides including ribozymes and modified 
antisense oligonucleotides that bind to and target specific mRNA molecules. 
Because GeneBlocs can be designed to target any specific mRNA, their potential 
applications are quite broad. Traditional antisense approaches have often relied 
heavily on the use of phosphorothioate modifications to enhance stability in 
biological samples, leading to a myriad of specificity problems stemming from non- 
specific protein binding and general cytotoxicity (Stein, 1995, Nature Medicine, 1, 
1119). In contrast, GeneBlocs contain a number of modifications that confer 
nuclease resistance while making minimal use of phosphorothioate linkages, which 
reduces toxicity, increases binding affinity and minimizes non-specific effects 
compared with traditional antisense oligonucleotides. Similar reagents have recently 
been utilized successfully in various cell culture systems (Vassar, et al, 1999, 
Science, 286, 735) and in vivo (Jarvis et al., manuscript in preparation). In addition, 
novel cationic lipids can be utilized to enhance cellular uptake in the presence of 
serum. Since ribozymes and antisense oligonucleotides regulate gene expression at
the RNA level, the ability to maintain a steady-state dose of GeneBloc over several 
days was important for target protein and phenotypic analysis. The advances in 
resistance to nuclease degradation and prolonged activity in vitro have supported the 
use of GeneBlocs in target validation applications. 

Target sites 

Targets for useful ribozymes and antisense nucleic acids can be determined 
as disclosed in Draper et al, WO 93/23569; Sullivan et al, WO 93/23057; 
Thompson et al, WO 94/02595; Draper et al, WO 95/04818; McSwiggen et al, US 
Patent No. 5,525,468. All of these publications are hereby incorporated by reference 
herein in their totality. Other examples include the following PCT applications, 
which concern inactivation of expression of disease-related genes: WO 95/23225, 
WO 95/13380, WO 94/02595, all of which are incorporated by reference herein. 
Rather than repeat the guidance provided in those documents here, specific examples 
of such methods are provided herein, not limiting to those in the art. Ribozymes and 
antisense to such targets are designed as described in those applications and 
synthesized to be tested in vitro and in vivo, as also described. The sequences of 
human CLCA1 RNAs were screened for optimal enzymatic nucleic acid and 
antisense target sites using a computer-folding algorithm. Antisense, hammerhead, 
DNAzyme, NCH, amberzyme, zinzyme, or G-Cleaver ribozyme binding/cleavage 
sites were identified. These sites are shown in Tables III to IX (all sequences are 5' 
to 3' in the tables; the underlined region can be any base-paired sequence, the actual 
sequence is not relevant here). The nucleotide base position is noted in the Tables as 
that site to be cleaved by the designated type of enzymatic nucleic acid molecule. 
While human sequences can be screened and enzymatic nucleic acid molecule 
and/or antisense thereafter designed, as discussed in Stinchcomb et al, WO 
95/23225, mouse targeted ribozymes may be useful to test efficacy of action of the 
enzymatic nucleic acid molecule and/or antisense prior to testing in humans. 
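
As a rough illustration of how candidate cleavage positions might be listed
before any folding analysis, the following sketch scans an RNA sequence for
NUH triplets (N = any base, H = A, C, or U), the triplet rule commonly cited
for hammerhead ribozymes. The NUH rule is an assumption brought in for this
example and is not stated in this section; the function name and the short
test sequence are hypothetical. The sketch does not reproduce the
computer-folding algorithm used for the screening described above, and it does
not cover the other motifs (NCH, G-Cleaver, DNAzyme, amberzyme, zinzyme).

```python
def find_nuh_sites(rna: str) -> list:
    """Return (position, triplet) pairs for every NUH triplet in the sequence.

    The position is the 1-based index of the H nucleotide, i.e. the base
    after which a hammerhead ribozyme would be expected to cleave.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    sites = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        # N is unconstrained, the middle base must be U, and H excludes G.
        if triplet[1] == "U" and triplet[2] in "ACU":
            sites.append((i + 3, triplet))
    return sites


# Hypothetical 11-nt sequence, not derived from CLCA1.
for position, triplet in find_nuh_sites("GGUCAAGUAGG"):
    print(position, triplet)
```

Each reported position would still need to be checked for accessibility in the
folded target, as described above.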

Antisense, hammerhead, DNAzyme, NCH, amberzyme, zinzyme or G-Cleaver 
ribozyme binding/cleavage sites were identified. The nucleic acid molecules are 
individually analyzed by computer folding (Jaeger et al, 1989 Proc. Natl. Acad. Sci. 
USA, 86, 7706) to assess whether the sequences fold into the appropriate secondary 
structure. Those nucleic acid molecules with unfavorable intramolecular 
interactions such as between the binding arms and the catalytic core are eliminated 
from consideration. Varying binding arm lengths can be chosen to optimize activity. 
Antisense, hammerhead, DNAzyme, NCH, amberzyme, zinzyme or G-Cleaver
ribozyme binding/cleavage sites were identified and were designed to anneal to 
various sites in the RNA target. The binding arms are complementary to the target 
site sequences described above. The nucleic acid molecules were chemically 
synthesized. The method of synthesis used follows the procedure for normal 
DNA/RNA synthesis as described below and in Usman et al, 1987 J. Am. Chem. 
Soc., 109, 7845; Scaringe et al, 1990 Nucleic Acids Res., 18, 5433; Wincott et al,
1995 Nucleic Acids Res. 23, 2677-2684; and Caruthers et al, 1992, Methods in
Enzymology 211, 3-19.

Synthesis of Nucleic acid Molecules 

Synthesis of nucleic acids greater than 100 nucleotides in length is difficult 
using automated methods, and the therapeutic cost of such molecules is prohibitive. 
In this invention, small nucleic acid motifs ("small" refers to nucleic acid motifs no
more than 100 nucleotides in length, preferably no more than 80 nucleotides in 
length, and most preferably no more than 50 nucleotides in length; e.g., antisense 
oligonucleotides, hammerhead or the NCH ribozymes) are preferably used for 
exogenous delivery. The simple structure of these molecules increases the ability of 
the nucleic acid to invade targeted regions of RNA structure. Exemplary molecules 
of the instant invention are chemically synthesized, and others can similarly be 
synthesized. 

Oligonucleotides (e.g., antisense GeneBlocs) are synthesized using protocols
known in the art as described in Caruthers et al, 1992, Methods in Enzymology 211,
3-19, Thompson et al, International PCT Publication No. WO 99/54459, Wincott et
al, 1995, Nucleic Acids Res. 23, 2677-2684, Wincott et al, 1997, Methods Mol.
Bio., 74, 59, Brennan et al, 1998, Biotechnol Bioeng., 61, 33-45, and Brennan, US
patent No. 6,001,311. All of these references are incorporated herein by reference.
The synthesis of oligonucleotides makes use of common nucleic acid protecting and
coupling groups, such as dimethoxytrityl at the 5'-end, and phosphoramidites at the
3'-end. In a non-limiting example, small scale syntheses are conducted on a 394
Applied Biosystems, Inc. synthesizer using a 0.2 μmol scale protocol with a 2.5 min
coupling step for 2'-O-methylated nucleotides and a 45 sec coupling step for
2'-deoxy nucleotides. Table II outlines the amounts and the contact times of the
reagents used in the synthesis cycle. Alternatively, syntheses at the 0.2 μmol scale
can be performed on a 96-well plate synthesizer, such as the instrument produced by
Protogene (Palo Alto, CA) with minimal modification to the cycle. A 33-fold excess
(60 μL of 0.11 M = 6.6 μmol) of 2'-O-methyl phosphoramidite and a 105-fold
excess of S-ethyl tetrazole (60 μL of 0.25 M = 15 μmol) can be used in each
coupling cycle of 2'-O-methyl residues relative to polymer-bound 5'-hydroxyl. A
22-fold excess (40 μL of 0.11 M = 4.4 μmol) of deoxy phosphoramidite and a
70-fold excess of S-ethyl tetrazole (40 μL of 0.25 M = 10 μmol) can be used in each
coupling cycle of deoxy residues relative to polymer-bound 5'-hydroxyl. Average
coupling yields on the 394 Applied Biosystems, Inc. synthesizer, determined by
colorimetric quantitation of the trityl fractions, are typically 97.5-99%. Other
oligonucleotide synthesis reagents for the 394 Applied Biosystems, Inc. synthesizer
include the following: detritylation solution is 3% TCA in methylene chloride (ABI);
capping is performed with 16% N-methyl imidazole in THF (ABI) and 10% acetic
anhydride/10% 2,6-lutidine in THF (ABI); and oxidation solution is 16.9 mM I2, 49
mM pyridine, 9% water in THF (PERSEPTIVE™). Burdick & Jackson Synthesis
Grade acetonitrile is used directly from the reagent bottle. S-Ethyltetrazole solution
(0.25 M in acetonitrile) is made up from the solid obtained from American
International Chemical, Inc. Alternately, for the introduction of phosphorothioate
linkages, Beaucage reagent (3H-1,2-Benzodithiol-3-one 1,1-dioxide, 0.05 M in
acetonitrile) is used.

Deprotection of the antisense oligonucleotides is performed as follows: the 
polymer-bound trityl-on oligoribonucleotide is transferred to a 4 mL glass screw top 
vial and suspended in a solution of 40% aq. methylamine (1 mL) at 65 °C for 10
min. After cooling to -20 °C, the supernatant is removed from the polymer support. 
The support is washed three times with 1.0 mL of EtOH:MeCN:H2O/3:1:1, vortexed
and the supernatant is then added to the first supernatant. The combined 
supernatants, containing the oligoribonucleotide, are dried to a white powder. 

The method of synthesis used for normal RNA including certain enzymatic
nucleic acid molecules follows the procedure as described in Usman et al, 1987, J.
Am. Chem. Soc., 109, 7845; Scaringe et al., 1990, Nucleic Acids Res., 18, 5433;
Wincott et al, 1995, Nucleic Acids Res. 23, 2677-2684 and Wincott et al, 1997,
Methods Mol. Bio., 74, 59, and makes use of common nucleic acid protecting and
coupling groups, such as dimethoxytrityl at the 5'-end, and phosphoramidites at the
3'-end. In a non-limiting example, small scale syntheses are conducted on a 394
Applied Biosystems, Inc. synthesizer using a 0.2 μmol scale protocol with a 7.5 min
coupling step for alkylsilyl protected nucleotides and a 2.5 min coupling step for
2'-O-methylated nucleotides. Table II outlines the amounts and the contact times of
the reagents used in the synthesis cycle. Alternatively, syntheses at the 0.2 μmol
scale can be done on a 96-well plate synthesizer, such as the instrument produced by
Protogene (Palo Alto, CA) with minimal modification to the cycle. A 33-fold excess
(60 μL of 0.11 M = 6.6 μmol) of 2'-O-methyl phosphoramidite and a 75-fold excess
of S-ethyl tetrazole (60 μL of 0.25 M = 15 μmol) can be used in each coupling cycle
of 2'-O-methyl residues relative to polymer-bound 5'-hydroxyl. A 66-fold excess
(120 μL of 0.11 M = 13.2 μmol) of alkylsilyl (ribo) protected phosphoramidite and a
150-fold excess of S-ethyl tetrazole (120 μL of 0.25 M = 30 μmol) can be used in
each coupling cycle of ribo residues relative to polymer-bound 5'-hydroxyl.
Average coupling yields on the 394 Applied Biosystems, Inc. synthesizer,
determined by colorimetric quantitation of the trityl fractions, are typically
97.5-99%. Other oligonucleotide synthesis reagents for the 394 Applied Biosystems, Inc.
synthesizer include the following: detritylation solution is 3% TCA in methylene
chloride (ABI); capping is performed with 16% N-methyl imidazole in THF (ABI) and
10% acetic anhydride/10% 2,6-lutidine in THF (ABI); and oxidation solution is 16.9
mM I2, 49 mM pyridine, 9% water in THF (PERSEPTIVE™). Burdick & Jackson
Synthesis Grade acetonitrile is used directly from the reagent bottle.
S-Ethyltetrazole solution (0.25 M in acetonitrile) is made up from the solid obtained
from American International Chemical, Inc. Alternately, for the introduction of
phosphorothioate linkages, Beaucage reagent (3H-1,2-Benzodithiol-3-one
1,1-dioxide, 0.05 M in acetonitrile) is used.
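
The fold-excess figures quoted for the RNA synthesis cycle above follow from
simple arithmetic: the amount of reagent delivered (volume times
concentration) divided by the 0.2 μmol synthesis scale. The sketch below
reproduces that calculation for the four reagent deliveries in the preceding
paragraph. The function name and the dictionary layout are illustrative
assumptions; the volumes, concentrations, and scale are taken from the text.

```python
def fold_excess(volume_ul: float, conc_molar: float, scale_umol: float) -> float:
    """Return reagent delivered (umol) divided by the synthesis scale (umol)."""
    delivered_umol = volume_ul * conc_molar  # uL x mol/L gives umol
    return delivered_umol / scale_umol


SCALE_UMOL = 0.2  # synthesis scale stated in the text

# (volume in uL, concentration in M) for each delivery in the RNA cycle.
deliveries = {
    "2'-O-methyl phosphoramidite": (60, 0.11),
    "S-ethyl tetrazole, 2'-O-methyl cycle": (60, 0.25),
    "alkylsilyl (ribo) phosphoramidite": (120, 0.11),
    "S-ethyl tetrazole, ribo cycle": (120, 0.25),
}

for name, (volume_ul, conc_molar) in deliveries.items():
    excess = fold_excess(volume_ul, conc_molar, SCALE_UMOL)
    print(f"{name}: {round(excess, 1)}-fold excess")
```

Because only the ratio of reagent to support-bound 5'-hydroxyl matters, the
same function can be reused with a different scale value when the synthesis
is scaled up or down.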

Deprotection of the RNA is performed using either a two-pot or one-pot 
protocol. For the two-pot protocol, the polymer-bound trityl-on oligoribonucleotide 
is transferred to a 4 mL glass screw top vial and suspended in a solution of 40% aq. 
methylamine (1 mL) at 65 °C for 10 min. After cooling to -20 °C, the supernatant is 
removed from the polymer support. The support is washed three times with 1.0 mL
of EtOH:MeCN:H2O/3:1:1, vortexed and the supernatant is then added to the first
supernatant. The combined supernatants, containing the oligoribonucleotide, are
dried to a white powder. The base deprotected oligoribonucleotide is resuspended in
anhydrous TEA/HF/NMP solution (300 μL of a solution of 1.5 mL
N-methylpyrrolidinone, 750 μL TEA and 1 mL TEA·3HF to provide a 1.4 M HF
concentration) and heated to 65 °C. After 1.5 h, the oligomer is quenched with 1.5
M NH4HCO3.

Alternatively, for the one-pot protocol, the polymer-bound trityl-on 
oligoribonucleotide is transferred to a 4 mL glass screw top vial and suspended in a 
solution of 33% ethanolic methylamine/DMSO: 1/1 (0.8 mL) at 65 °C for 15 min. 
The vial is brought to r.t. TEA·3HF (0.1 mL) is added and the vial is heated at 65 °C
for 15 min. The sample is cooled at -20 °C and then quenched with 1.5 M 
NH4HCO3. 
For purification of the trityl-on oligomers, the quenched NH4HCO3 solution is
loaded onto a C-18 containing cartridge that had been prewashed with acetonitrile 
followed by 50 mM TEAA. After washing the loaded cartridge with water, the RNA 
is detritylated with 0.5% TFA for 13 min. The cartridge is then washed again with 
water, salt exchanged with 1 M NaCl and washed with water again. The 
oligonucleotide is then eluted with 30% acetonitrile. 

Inactive hammerhead ribozymes or binding attenuated control (BAC)
oligonucleotides are synthesized by substituting a U for G5 and a U for A14
(numbering from Hertel, K. J., et al, 1992, Nucleic Acids Res., 20, 3252). Similarly,
one or more nucleotide substitutions can be introduced in other enzymatic nucleic 
acid molecules to inactivate the molecule and such molecules can serve as a negative 
control. 

The average stepwise coupling yields are typically >98% (Wincott et al, 1995 
Nucleic Acids Res. 23, 2677-2684). Those of ordinary skill in the art will recognize 
that the scale of synthesis can be adapted to be larger or smaller than the examples 
described above including but not limited to 96-well format; all that is important is
the ratio of chemicals used in the reaction. 
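
The practical effect of the stepwise coupling yield can be estimated with a
short calculation: the expected fraction of full-length product is the
stepwise yield raised to the number of coupling steps. The sketch below
assumes that the first nucleotide is already attached to the support, so an
oligonucleotide of n residues requires n - 1 couplings; that assumption, the
function name, and the 36-nucleotide example length are illustrative and are
not taken from the text.

```python
def full_length_fraction(stepwise_yield: float, length_nt: int) -> float:
    """Estimate the fraction of chains that reach full length.

    Assumes the first residue is pre-loaded on the support, so a chain of
    length_nt residues needs (length_nt - 1) successful couplings.
    """
    couplings = length_nt - 1
    return stepwise_yield ** couplings


# Compare the two ends of the quoted yield range for a hypothetical 36-mer.
for stepwise in (0.975, 0.98, 0.99):
    fraction = full_length_fraction(stepwise, 36)
    print(f"stepwise {stepwise:.3f} -> full-length fraction {fraction:.2f}")
```

At a 98% stepwise yield roughly half of the chains of a 36-mer reach full
length, which is consistent with the preference stated above for small nucleic
acid motifs.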

Alternatively, the nucleic acid molecules of the present invention can be 
synthesized separately and joined together post-synthetically, for example by ligation 
(Moore et al, 1992, Science 256, 9923; Draper et al., International PCT publication 
No. WO 93/23569; Shabarova et al, 1991, Nucleic Acids Research 19, 4247; Bellon 
et al, 1997, Nucleosides & Nucleotides, 16, 951; Bellon et al, 1997, Bioconjugate 
Chem. 8, 204). 

The nucleic acid molecules of the present invention are modified extensively 
to enhance stability by modification with nuclease resistant groups, for example,
2'-amino, 2'-C-allyl, 2'-fluoro, 2'-O-methyl, 2'-H (for a review see Usman and
Cedergren, 1992, TIBS 17, 34; Usman et al., 1994, Nucleic Acids Symp. Ser. 31, 
163). Ribozymes are purified by gel electrophoresis using general methods or are 
purified by high pressure liquid chromatography (HPLC; See Wincott et al, supra, 
the totality of which is hereby incorporated herein by reference) and are
resuspended in water.

The sequences of the ribozymes and antisense constructs that are chemically 
synthesized, useful in this study, are shown in Tables III to IX. Those in the art 
will recognize that these sequences are representative only of many more such 
sequences where the enzymatic portion of the ribozyme (all but the binding arms) is 
altered to affect activity. The ribozyme and antisense construct sequences listed in
Tables III to IX may be formed of ribonucleotides or other nucleotides or non- 
nucleotides. Such ribozymes with enzymatic activity are equivalent to the ribozymes 
described specifically in the Tables. 

Optimizing Activity of the nucleic acid molecule of the invention. 

Chemically synthesizing nucleic acid molecules with modifications (base,
sugar and/or phosphate) that prevent their degradation by serum ribonucleases may
increase their potency (see e.g., Eckstein et al., International Publication No.
WO 92/07065; Perrault et al., 1990 Nature 344, 565; Pieken et al., 1991, Science
253, 314; Usman and Cedergren, 1992, Trends in Biochem. Sci. 17, 334; Usman et
al., International Publication No. WO 93/15187; Rossi et al., International
Publication No. WO 91/03162; Sproat, US Patent No. 5,334,711; and Burgin et al.,
supra; all of these describe various chemical modifications that can be made to the
base, phosphate and/or sugar moieties of the nucleic acid molecules described
herein). All these references are incorporated by reference herein. Modifications
which enhance their efficacy in cells, and removal of bases from nucleic acid
molecules to shorten oligonucleotide synthesis times and reduce chemical
requirements are desired.

There are several examples in the art describing sugar, base and phosphate
modifications that can be introduced into nucleic acid molecules with significant
enhancement in their nuclease stability and efficacy. For example, oligonucleotides
are modified to enhance stability and/or enhance biological activity by modification
with nuclease resistant groups, for example, 2'-amino, 2'-C-allyl, 2'-fluoro,
2'-O-methyl, 2'-H, nucleotide base modifications (for a review see Usman and Cedergren,
1992, TIBS 17, 34; Usman et al., 1994, Nucleic Acids Symp. Ser. 31, 163; Burgin et
al., 1996, Biochemistry, 35, 14090). Sugar modifications of nucleic acid molecules
have been extensively described in the art (see Eckstein et al., International
Publication PCT No. WO 92/07065; Perrault et al., Nature, 1990, 344, 565-568;
Pieken et al., Science, 1991, 253, 314-317; Usman and Cedergren, Trends in
Biochem. Sci., 1992, 17, 334-339; Usman et al., International Publication PCT No.
WO 93/15187; Sproat, US Patent No. 5,334,711 and Beigelman et al., 1995, J.
Biol. Chem., 270, 25702; Beigelman et al., International PCT publication No. WO
97/26270; Beigelman et al., US Patent No. 5,716,824; Usman et al., US patent No.
5,627,053; Woolf et al., International PCT Publication No. WO 98/13526;
Thompson et al., USSN 60/082,404 which was filed on April 20, 1998; Karpeisky et
al., 1998, Tetrahedron Lett., 39, 1131; Earnshaw and Gait, 1998, Biopolymers
(Nucleic acid Sciences), 48, 39-55; Verma and Eckstein, 1998, Annu. Rev. Biochem.,
67, 99-134; and Burlina et al., 1997, Bioorg. Med. Chem., 5, 1999-2010; all of the
references are hereby incorporated by reference herein in their totalities). Such
publications describe general methods and strategies to determine the location of
incorporation of sugar, base and/or phosphate modifications and the like into
ribozymes without inhibiting catalysis. In view of such teachings, similar
modifications can be used as described herein to modify the nucleic acid molecules
of the instant invention.

While chemical modification of oligonucleotide internucleotide linkages with
phosphorothioate, phosphorodithioate, and/or 5'-methylphosphonate linkages
improves stability, too many of these modifications may cause some toxicity.
Therefore when designing nucleic acid molecules the amount of these 
internucleotide linkages should be minimized. The reduction in the concentration of 
these linkages should lower toxicity resulting in increased efficacy and higher 
specificity of these molecules. 

Nucleic acid molecules having chemical modifications which maintain or
enhance activity are provided. Such nucleic acid is also generally more resistant to
nucleases than unmodified nucleic acid. Thus, in a cell and/or in vivo the activity
may not be significantly lowered. Therapeutic nucleic acid molecules delivered
exogenously must optimally be stable within cells until translation of the target RNA
has been inhibited long enough to reduce the levels of the undesirable protein. This
period of time varies between hours to days depending upon the disease state.
Clearly, nucleic acid molecules must be resistant to nucleases in order to function as
effective intracellular therapeutic agents. Improvements in the chemical synthesis of
RNA and DNA (Wincott et al, 1995 Nucleic Acids Res. 23, 2677; Caruthers et al,
1992, Methods in Enzymology 211, 3-19 (incorporated by reference herein)) have
expanded the ability to modify nucleic acid molecules by introducing nucleotide
modifications to enhance their nuclease stability as described above.

Use of the nucleic acid-based molecules of the invention will lead to
better treatment of the disease progression by affording the possibility of
combination therapies (e.g., multiple antisense or enzymatic nucleic acid molecules
targeted to different genes, nucleic acid molecules coupled with known small
molecule inhibitors, or intermittent treatment with combinations of molecules
(including different motifs) and/or other chemical or biological molecules). The
treatment of patients with nucleic acid molecules may also include combinations of
different types of nucleic acid molecules.

Therapeutic nucleic acid molecules (e.g., enzymatic nucleic acid molecules
and antisense nucleic acid molecules) delivered exogenously must optimally be
stable within cells until translation of the target RNA has been inhibited long enough
to reduce the levels of the undesirable protein. This period of time varies from
hours to days depending upon the disease state. Clearly, these nucleic acid
molecules must be resistant to nucleases in order to function as effective intracellular
therapeutic agents. Improvements in the chemical synthesis of nucleic acid
molecules described in the instant invention and in the art have expanded the ability
to modify nucleic acid molecules by introducing nucleotide modifications to enhance
their nuclease stability as described above.

By "enhanced enzymatic activity" is meant to include activity measured in 
cells and/or in vivo where the activity is a reflection of both catalytic activity and 
ribozyme stability. In this invention, the product of these properties is increased or 
not significantly (less than 10-fold) decreased in vivo compared to an all RNA 
ribozyme or all DNA enzyme. 
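
As a non-limiting illustration, the criterion just defined can be written as a simple check. The following Python sketch is an assumption-laden example only: the function name `maintains_activity`, its parameters, and the use of a single number to stand for the combined activity (the product of catalytic activity and stability) are hypothetical and are not taken from the specification.

```python
def maintains_activity(modified: float, reference: float,
                       max_fold_decrease: float = 10.0) -> bool:
    """Return True if the modified molecule's combined activity is increased,
    or decreased by less than `max_fold_decrease`, relative to the reference.

    `modified` and `reference` are hypothetical scalar measurements of the
    combined activity (catalytic activity x stability) in the same units;
    `reference` stands for an all-RNA ribozyme or all-DNA enzyme.
    """
    if reference <= 0 or modified < 0:
        raise ValueError("reference must be positive and modified non-negative")
    # "Less than 10-fold decreased" means reference / modified < 10.
    # The comparison is rearranged so that modified == 0 needs no division.
    return modified * max_fold_decrease > reference


if __name__ == "__main__":
    print(maintains_activity(0.5, 1.0))   # a 2-fold decrease
    print(maintains_activity(0.05, 1.0))  # a 20-fold decrease
```

The default threshold of 10 mirrors the "less than 10-fold" language above; how the combined activity would actually be measured is left open here.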

In yet another preferred embodiment, nucleic acid catalysts having chemical 
modifications which maintain or enhance enzymatic activity are provided. Such 
nucleic acid is also generally more resistant to nucleases than unmodified nucleic 
acid. Thus, in a cell and/or in vivo the activity may not be significantly lowered. As 
exemplified herein, such ribozymes are useful in a cell and/or in vivo even if activity
overall is reduced 10-fold (Burgin et al., 1996, Biochemistry, 35, 14090). Such
ribozymes herein are said to "maintain" the enzymatic activity of an all RNA 
ribozyme. 

In another aspect the nucleic acid molecules comprise a 5' and/or a 3'- cap 
structure. 

By "cap structure" is meant chemical modifications which have been
incorporated at either terminus of the oligonucleotide (see, for example, Wincott et
al., WO 97/26270, incorporated by reference herein). These terminal modifications
protect the nucleic acid molecule from exonuclease degradation, and may help in
delivery and/or localization within a cell. The cap may be present at the 5'-terminus
(5'-cap) or at the 3'-terminus (3'-cap) or may be present on both termini. In
non-limiting examples the 5'-cap is selected from the group comprising inverted abasic
residue (moiety); 4',5'-methylene nucleotide; 1-(beta-D-erythrofuranosyl) nucleotide;
4'-thio nucleotide; carbocyclic nucleotide; 1,5-anhydrohexitol nucleotide;
L-nucleotides; alpha-nucleotides; modified base nucleotide; phosphorodithioate
linkage; threo-pentofuranosyl nucleotide; acyclic 3',4'-seco nucleotide; acyclic 3,4-
dihydroxybutyl nucleotide; acyclic 3,5-dihydroxypentyl nucleotide; 3'-3'-inverted
nucleotide moiety; 3'-3'-inverted abasic moiety; 3'-2'-inverted nucleotide moiety;
3'-2'-inverted abasic moiety; 1,4-butanediol phosphate; 3'-phosphoramidate;
hexylphosphate; aminohexyl phosphate; 3'-phosphate; 3'-phosphorothioate;
phosphorodithioate; or bridging or non-bridging methylphosphonate moiety (for
more details see Wincott et al., International PCT publication No. WO 97/26270,
incorporated by reference herein).

In yet another preferred embodiment, the 3'-cap is selected from a group
comprising 4',5'-methylene nucleotide; 1-(beta-D-erythrofuranosyl) nucleotide;
4'-thio nucleotide; carbocyclic nucleotide; 5'-amino-alkyl phosphate; 1,3-diamino-2-
propyl phosphate; 3-aminopropyl phosphate; 6-aminohexyl phosphate; 1,2-
aminododecyl phosphate; hydroxypropyl phosphate; 1,5-anhydrohexitol nucleotide;
L-nucleotide; alpha-nucleotide; modified base nucleotide; phosphorodithioate;
threo-pentofuranosyl nucleotide; acyclic 3',4'-seco nucleotide; 3,4-dihydroxybutyl
nucleotide; 3,5-dihydroxypentyl nucleotide; 5'-5'-inverted nucleotide moiety; 5'-5'-
inverted abasic moiety; 5'-phosphoramidate; 5'-phosphorothioate; 1,4-butanediol
phosphate; 5'-amino; bridging and/or non-bridging 5'-phosphoramidate,
phosphorothioate and/or phosphorodithioate; bridging or non-bridging
methylphosphonate; and 5'-mercapto moieties (for more details, see Beaucage and
Iyer, 1993, Tetrahedron, 49, 1925; incorporated by reference herein).

By the term "non-nucleotide" is meant any group or compound which can be 
incorporated into a nucleic acid chain in the place of one or more nucleotide units, 
including either sugar and/or phosphate substitutions, and allows the remaining 
bases to exhibit their enzymatic activity. The group or compound is abasic in that it
does not contain a commonly recognized nucleotide base, such as adenine,
guanine, cytosine, uracil or thymine.

An "alkyl" group refers to a saturated aliphatic hydrocarbon, including
straight-chain, branched-chain, and cyclic alkyl groups. Preferably, the alkyl group
has 1 to 12 carbons. More preferably it is a lower alkyl of from 1 to 7 carbons, more
preferably 1 to 4 carbons. The alkyl group may be substituted or unsubstituted.
When substituted the substituted group(s) is preferably hydroxyl, cyano, alkoxy,
=O, =S, NO2 or N(CH3)2, amino, or SH. The term also includes alkenyl groups
which are unsaturated hydrocarbon groups containing at least one carbon-carbon
double bond, including straight-chain, branched-chain, and cyclic groups.
Preferably, the alkenyl group has 1 to 12 carbons. More preferably it is a lower
alkenyl of from 1 to 7 carbons, more preferably 1 to 4 carbons. The alkenyl group
may be substituted or unsubstituted. When substituted the substituted group(s) is
preferably hydroxyl, cyano, alkoxy, =O, =S, NO2, halogen, N(CH3)2, amino, or
SH. The term "alkyl" also includes alkynyl groups which have an unsaturated
hydrocarbon group containing at least one carbon-carbon triple bond, including
straight-chain, branched-chain, and cyclic groups. Preferably, the alkynyl group has
1 to 12 carbons. More preferably it is a lower alkynyl of from 1 to 7 carbons, more
preferably 1 to 4 carbons. The alkynyl group may be substituted or unsubstituted.
When substituted the substituted group(s) is preferably hydroxyl, cyano, alkoxy,
=O, =S, NO2 or N(CH3)2, amino or SH.

Such alkyl groups may also include aryl, alkylaryl, carbocyclic aryl,
heterocyclic aryl, amide and ester groups. An "aryl" group refers to an aromatic
group which has at least one ring having a conjugated π electron system and includes
carbocyclic aryl, heterocyclic aryl and biaryl groups, all of which may be optionally
substituted. The preferred substituent(s) of aryl groups are halogen, trihalomethyl,
hydroxyl, SH, OH, cyano, alkoxy, alkyl, alkenyl, alkynyl, and amino groups. An
"alkylaryl" group refers to an alkyl group (as described above) covalently joined to
an aryl group (as described above). Carbocyclic aryl groups are groups wherein the
ring atoms on the aromatic ring are all carbon atoms. The carbon atoms are
optionally substituted. Heterocyclic aryl groups are groups having from 1 to 3
heteroatoms as ring atoms in the aromatic ring and the remainder of the ring atoms
are carbon atoms. Suitable heteroatoms include oxygen, sulfur, and nitrogen, and
suitable heterocyclic aryl groups include furanyl, thienyl, pyridyl, pyrrolyl,
N-lower alkyl pyrrolo, pyrimidyl, pyrazinyl, imidazolyl and the like, all optionally
substituted. An "amide" refers to an
-C(O)-NH-R, where R is either alkyl, aryl, alkylaryl or hydrogen. An "ester" refers
to an -C(O)-OR', where R' is either alkyl, aryl, alkylaryl or hydrogen.

By "nucleotide" as used herein is as recognized in the art to include natural
bases (standard), and modified bases well known in the art. Such bases are generally
located at the 1' position of a nucleotide sugar moiety. Nucleotides generally
comprise a base, sugar and a phosphate group. The nucleotides can be unmodified or
modified at the sugar, phosphate and/or base moiety (also referred to
interchangeably as nucleotide analogs, modified nucleotides, non-natural
nucleotides, non-standard nucleotides and other; see for example, Usman and
McSwiggen, supra; Eckstein et al., International PCT Publication No. WO
92/07065; Usman et al., International PCT Publication No. WO 93/15187; Uhlmann
& Peyman, 1990, Chemical Reviews, 90, 4, 544-579; all are hereby incorporated by
reference herein). There are several examples of modified nucleic acid bases known
in the art as summarized by Limbach et al., 1994, Nucleic Acids Res., 22, 2183.
Some of the non-limiting examples of base modifications that can be introduced into
nucleic acid molecules include inosine, purine, pyridin-4-one, pyridin-2-one,
phenyl, pseudouracil, 2,4,6-trimethoxy benzene, 3-methyl uracil, dihydrouridine,
naphthyl, aminophenyl, 5-alkylcytidines (e.g., 5-methylcytidine), 5-alkyluridines
(e.g., ribothymidine), 5-halouridine (e.g., 5-bromouridine) or 6-azapyrimidines or
6-alkylpyrimidines (e.g., 6-methyluridine), propyne, and others (Burgin et al., 1996,
Biochemistry, 35, 14090; Uhlmann & Peyman, supra). By "modified bases" in this
aspect is meant nucleotide bases other than adenine, guanine, cytosine and uracil at
1' position or their equivalents; such bases may be used at any position, for example,
within the catalytic core of an enzymatic nucleic acid molecule and/or in the
substrate-binding regions of the nucleic acid molecule.

In a preferred embodiment, the invention features modified ribozymes with
phosphate backbone modifications comprising one or more phosphorothioate,
phosphorodithioate, methylphosphonate, morpholino, amidate carbamate,
carboxymethyl, acetamidate, polyamide, sulfonate, sulfonamide, sulfamate,
formacetal, thioformacetal, and/or alkylsilyl substitutions. For a review of
oligonucleotide backbone modifications see Hunziker and Leumann, 1995, Nucleic
Acid Analogues: Synthesis and Properties, in Modern Synthetic Methods, VCH,
331-417, and Mesmaeker et al., 1994, Novel Backbone Replacements for
Oligonucleotides, in Carbohydrate Modifications in Antisense Research, ACS,
24-39. These references are hereby incorporated by reference herein.

By "abasic" is meant sugar moieties lacking a base or having other chemical 
groups in place of a base at the 1' position (for more details, see Wincott et al.,
International PCT publication No. WO 97/26270).

By "unmodified nucleoside" is meant one of the bases adenine, cytosine, 
guanine, thymine, or uracil joined to the 1' carbon of β-D-ribo-furanose.

By "modified nucleoside" is meant any nucleotide base which contains a 
modification in the chemical structure of an unmodified nucleotide base, sugar
and/or phosphate.

In connection with 2'-modified nucleotides as described for the present
invention, by "amino" is meant 2'-NH2 or 2'-O-NH2, which may be modified or
unmodified. Such modified groups are described, for example, in Eckstein et al.,
U.S. Patent 5,672,695 and Matulic-Adamic et al., WO 98/28317, respectively, which
are both incorporated by reference herein in their entireties.

Various modifications to nucleic acid (e.g., antisense and ribozyme) structure
can be made to enhance the utility of these molecules. Such modifications will
enhance shelf-life, half-life in vitro, stability, and ease of introduction of such
oligonucleotides to the target site, e.g., to enhance penetration of cellular
membranes, and confer the ability to recognize and bind to targeted cells.

Use of these molecules will lead to better treatment of the disease progression 
by affording the possibility of combination therapies (e.g., multiple ribozymes 
targeted to different genes, ribozymes coupled with known small molecule 
inhibitors, or intermittent treatment with combinations of ribozymes (including 
different ribozyme motifs) and/or other chemical or biological molecules). The
treatment of patients with nucleic acid molecules may also include combinations of
different types of nucleic acid molecules. Therapies may be devised which include a
mixture of ribozymes (including different ribozyme motifs), antisense and/or 2-5A
chimera molecules to one or more targets to alleviate symptoms of a disease.

Administration of Nucleic Acid Molecules

Methods for the delivery of nucleic acid molecules are described in Akhtar et
al., 1992, Trends Cell Bio., 2, 139; and Delivery Strategies for Antisense
Oligonucleotide Therapeutics, ed. Akhtar, 1995, which are both incorporated herein
by reference. Sullivan et al., PCT WO 94/02595, further describes the general
methods for delivery of enzymatic RNA molecules. These protocols may be utilized
for the delivery of virtually any nucleic acid molecule. Nucleic acid molecules may
be administered to cells by a variety of methods known to those familiar to the art,
including, but not restricted to, encapsulation in liposomes, by iontophoresis, or by
incorporation into other vehicles, such as hydrogels, cyclodextrins, biodegradable
nanocapsules, and bioadhesive microspheres. For some indications, nucleic acid
molecules may be directly delivered ex vivo to cells or tissues with or without the
aforementioned vehicles. Alternatively, the nucleic acid/vehicle combination is
locally delivered by direct injection or by use of a catheter, infusion pump or stent.
Other routes of delivery include, but are not limited to, intravascular, intramuscular,
subcutaneous or joint injection, aerosol inhalation, oral (tablet or pill form), topical,
systemic, ocular, intraperitoneal and/or intrathecal delivery. More detailed
descriptions of nucleic acid delivery and administration are provided in Sullivan et
al., supra; Draper et al., PCT WO 93/23569; Beigelman et al., PCT WO 99/05094;
and Klimuk et al., PCT WO 99/04819, all of which have been incorporated by
reference herein.

In addition, the nucleic acid molecules of the instant invention, used to treat
pulmonary diseases and disorders, may be administered directly to the lungs via
pulmonary delivery. The pulmonary delivery of oligonucleotides is described by
Bennett et al., International PCT publication Nos. WO 99/60166 and WO 99/60010;
Danahay et al., 1999, Pharm. Res., 16(10), 1542-1549; Metzger and Nyce, 1999, J.
Allergy Clin. Immunol., 104(2, Pt. 1), 260-266; Nicklin et al., 1998, Pharm. Res.,
15(4), 583-591; Illum and Watts, International PCT publication No. WO 97/35562;
and Nyce, 1997, Expert Opin. Invest. Drugs, 6(9), 1149-1156.

The molecules of the instant invention can be used as pharmaceutical agents.
Pharmaceutical agents prevent, inhibit the occurrence of, or treat (alleviate a symptom
to some extent, preferably all of the symptoms of) a disease state in a patient.

The negatively charged polynucleotides of the invention can be administered
(e.g., RNA, DNA or protein) and introduced into a patient by any standard means,
with or without stabilizers, buffers, and the like, to form a pharmaceutical
composition. When it is desired to use a liposome delivery mechanism, standard
protocols for formation of liposomes can be followed. The compositions of the 
present invention may also be formulated and used as tablets, capsules or elixirs for 
oral administration; suppositories for rectal administration; sterile solutions; 
suspensions for injectable administration; and other compositions known in the art. 

The present invention also includes pharmaceutically acceptable formulations
of the compounds described. These formulations include salts of the above
compounds, e.g., acid addition salts, including salts of hydrochloric, hydrobromic,
acetic acid, and benzene sulfonic acid.

A pharmacological composition or formulation refers to a composition or
formulation in a form suitable for administration, e.g., systemic administration, into
a cell or patient, preferably a human. Suitable forms, in part, depend upon the use or
the route of entry, for example oral, transdermal, or by injection. Such forms should
not prevent the composition or formulation from reaching a target cell (i.e., a cell to
which the negatively charged polymer is desired to be delivered). For example,
pharmacological compositions injected into the blood stream should be soluble.
Other factors are known in the art, and include considerations such as toxicity and
forms which prevent the composition or formulation from exerting its effect.

By "systemic administration" is meant in vivo systemic absorption or accumulation of
drugs in the blood stream followed by distribution throughout the entire body.
Administration routes that lead to systemic absorption include, without limitation:
intravenous, subcutaneous, intraperitoneal, inhalation, oral, intrapulmonary and
intramuscular. Each of these administration routes exposes the desired negatively
charged polymers, e.g., nucleic acids, to an accessible diseased tissue. The rate of
entry of a drug into the circulation has been shown to be a function of molecular
weight or size. The use of a liposome or other drug carrier comprising the
compounds of the instant invention can potentially localize the drug, for example, in
certain tissue types, such as the tissues of the reticular endothelial system (RES). A
liposome formulation that can facilitate the association of drug with the surface of
cells, such as lymphocytes and macrophages, is also useful. This approach may
provide enhanced delivery of the drug to target cells by taking advantage of the
specificity of macrophage and lymphocyte immune recognition of abnormal cells,
such as cancer cells.

By "pharmaceutically acceptable formulation" is meant a composition or
formulation that allows for the effective distribution of the nucleic acid molecules of
the instant invention in the physical location most suitable for their desired activity.
Non-limiting examples of agents suitable for formulation with the nucleic acid
molecules of the instant invention include: P-glycoprotein inhibitors (such as
Pluronic P85) which can enhance entry of drugs into the CNS (Jolliet-Riant and
Tillement, 1999, Fundam. Clin. Pharmacol., 13, 16-26); biodegradable polymers,
such as poly (DL-lactide-coglycolide) microspheres for sustained release delivery
after intracerebral implantation (Emerich, D.F. et al., 1999, Cell Transplant, 8, 47-58;
Alkermes, Inc., Cambridge, MA); and loaded nanoparticles, such as those made of
polybutylcyanoacrylate, which can deliver drugs across the blood brain barrier and
can alter neuronal uptake mechanisms (Prog Neuropsychopharmacol Biol
Psychiatry, 23, 941-949, 1999). Other non-limiting examples of delivery strategies
for the nucleic acid molecules of the instant invention include material described in
Boado et al., 1998, J. Pharm. Sci., 87, 1308-1315; Tyler et al., 1999, FEBS Lett.,
421, 280-284; Pardridge et al., 1995, PNAS USA, 92, 5592-5596; Boado, 1995, Adv.
Drug Delivery Rev., 15, 73-107; Aldrian-Herrada et al., 1998, Nucleic Acids Res.,
26, 4910-4916; and Tyler et al., 1999, PNAS USA, 96, 7053-7058.

The invention also features the use of the composition comprising
surface-modified liposomes containing poly (ethylene glycol) lipids (PEG-modified, or
long-circulating liposomes or stealth liposomes). These formulations offer a method for
increasing the accumulation of drugs in target tissues. This class of drug carriers
resists opsonization and elimination by the mononuclear phagocytic system (MPS or
RES), thereby enabling longer blood circulation times and enhanced tissue exposure
for the encapsulated drug (Lasic et al., Chem. Rev. 1995, 95, 2601-2627; Ishiwata et
al., Chem. Pharm. Bull. 1995, 43, 1005-1011). All incorporated by reference herein.
Such liposomes have been shown to accumulate selectively in tumors, presumably
by extravasation and capture in the neovascularized target tissues (Lasic et al.,
Science 1995, 267, 1275-1276; Oku et al., 1995, Biochim. Biophys. Acta, 1238,
86-90). All incorporated by reference herein. The long-circulating liposomes enhance
the pharmacokinetics and pharmacodynamics of DNA and RNA, particularly
compared to conventional cationic liposomes which are known to accumulate in
tissues of the MPS (Liu et al., J. Biol. Chem. 1995, 42, 24864-24870; Choi et al.,
International PCT Publication No. WO 96/10391; Ansell et al., International PCT
Publication No. WO 96/10390; Holland et al., International PCT Publication No.
WO 96/10392; all of which are incorporated by reference herein). Long-circulating
liposomes are also likely to protect drugs from nuclease degradation to a greater
extent compared to cationic liposomes, based on their ability to avoid accumulation
in metabolically aggressive MPS tissues such as the liver and spleen.

In addition, the invention features the use of methods to deliver the nucleic
acid molecules of the instant invention to hematopoietic cells, including monocytes
and lymphocytes. These methods are described in detail by Hartmann et al., 1998, J.
Pharmacol. Exp. Ther., 285(2), 920-928; Kronenwett et al., 1998, Blood, 91(3),
852-862; Filion and Phillips, 1997, Biochim. Biophys. Acta, 1329(2), 345-356; Ma and
Wei, 1996, Leuk. Res., 20(11/12), 925-930; and Bongartz et al., 1994, Nucleic Acids
Research, 22(22), 4681-8. Such methods, as described above, include the use of free
oligonucleotide, cationic lipid formulations, liposome formulations including pH
sensitive liposomes and immunoliposomes, and bioconjugates including
oligonucleotides conjugated to fusogenic peptides, for the transfection of
hematopoietic cells with oligonucleotides.

The present invention also includes compositions prepared for storage or 
administration which include a pharmaceutically effective amount of the desired 
compounds in a pharmaceutically acceptable carrier or diluent. Acceptable carriers 
or diluents for therapeutic use are well known in the pharmaceutical art, and are 
described, for example, in Remington's Pharmaceutical Sciences, Mack Publishing
Co. (A.R. Gennaro edit. 1985), hereby incorporated by reference herein. For
example, preservatives, stabilizers, dyes and flavoring agents may be provided.
These include sodium benzoate, sorbic acid and esters of p-hydroxybenzoic acid. In
addition, antioxidants and suspending agents may be used. 

A pharmaceutically effective dose is that dose required to prevent, inhibit the
occurrence of, or treat (alleviate a symptom to some extent, preferably all of the
symptoms of) a disease state. The pharmaceutically effective dose depends on the
type of disease, the composition used, the route of administration, the type of
mammal being treated, the physical characteristics of the specific mammal under
consideration, concurrent medication, and other factors which those skilled in the
medical arts will recognize. Generally, an amount between 0.1 mg/kg and 100
mg/kg body weight/day of active ingredients is administered, dependent upon
potency of the negatively charged polymer.
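
For illustration only, the general range stated above (0.1 mg/kg to 100 mg/kg body weight/day) can be converted into a total daily amount for a given body weight. The Python sketch below is a plain arithmetic example, not a dosing recommendation; the function name `daily_dose_range_mg` and its parameter names are hypothetical and do not appear in the specification.

```python
from typing import Tuple


def daily_dose_range_mg(body_weight_kg: float,
                        low_mg_per_kg: float = 0.1,
                        high_mg_per_kg: float = 100.0) -> Tuple[float, float]:
    """Return the (lowest, highest) total daily amount in mg for one subject.

    The defaults are the general bounds stated in the text; the actual dose
    within this range depends on potency and the other factors listed there.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not 0 < low_mg_per_kg <= high_mg_per_kg:
        raise ValueError("bounds must satisfy 0 < low <= high")
    # Total daily amount (mg) = per-kg dose (mg/kg/day) x body weight (kg).
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


if __name__ == "__main__":
    low, high = daily_dose_range_mg(70.0)
    print(f"{low:.1f} mg/day to {high:.1f} mg/day")
```

The 70 kg body weight in the usage lines is an arbitrary example value.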

The nucleic acid molecules of the present invention may also be administered
to a patient in combination with other therapeutic compounds to increase the overall
therapeutic effect. The use of multiple compounds to treat an indication may
increase the beneficial effects while reducing the presence of side effects. Oxygen
therapy, bronchodilators, corticosteroids, antibacterials, vaccinations, acetylcysteine,
mucokinetic agents, and DNase (Pulmozyme) are non-limiting examples of
compounds and/or methods that can be combined with or used in conjunction with
the nucleic acid molecules (e.g., ribozymes and antisense molecules) of the instant
invention. Those skilled in the art will recognize that other drug compounds and
therapies can be similarly and readily combined with the nucleic acid molecules of
the instant invention (e.g., ribozymes and antisense molecules) and are, therefore,
within the scope of the instant invention.

Alternatively, certain of the nucleic acid molecules of the instant invention can
be expressed within cells from eukaryotic promoters (e.g., Izant and Weintraub,
1985, Science, 229, 345; McGarry and Lindquist, 1986, Proc. Natl. Acad. Sci. USA,
83, 399; Scanlon et al., 1991, Proc. Natl. Acad. Sci. USA, 88, 10591-5;
Kashani-Sabet et al., 1992, Antisense Res. Dev., 2, 3-15; Dropulic et al., 1992, J. Virol., 66,
1432-41; Weerasinghe et al., 1991, J. Virol., 65, 5531-4; Ojwang et al., 1992,
Proc. Natl. Acad. Sci. USA, 89, 10802-6; Chen et al., 1992, Nucleic Acids Res., 20,
4581-9; Sarver et al., 1990, Science, 247, 1222-1225; Thompson et al., 1995,
Nucleic Acids Res., 23, 2259; Good et al., 1997, Gene Therapy, 4, 45; all of the
references are hereby incorporated in their totality by reference herein). Those
skilled in the art realize that any nucleic acid can be expressed in eukaryotic cells
from the appropriate DNA/RNA vector. The activity of such nucleic acids can be
augmented by their release from the primary transcript by a ribozyme (Draper et al.,
PCT WO 93/23569, and Sullivan et al., PCT WO 94/02595; Ohkawa et al., 1992,
Nucleic Acids Symp. Ser., 27, 15-6; Taira et al., 1991, Nucleic Acids Res., 19,
5125-30; Ventura et al., 1993, Nucleic Acids Res., 21, 3249-55; Chowrira et al., 1994, J.
Biol. Chem., 269, 25856; all of these references are hereby incorporated in their
totalities by reference herein).

In another aspect of the invention, RNA molecules of the present invention are
preferably expressed from transcription units (see, for example, Couture et al., 1996,
TIG., 12, 510) inserted into DNA or RNA vectors. The recombinant vectors are
preferably DNA plasmids or viral vectors. Ribozyme expressing viral vectors could
be constructed based on, but not limited to, adeno-associated virus, retrovirus,
adenovirus, or alphavirus. Preferably, the recombinant vectors capable of expressing
the nucleic acid molecules are delivered as described above, and persist in target
cells. Alternatively, viral vectors may be used that provide for transient expression
of nucleic acid molecules. Such vectors might be repeatedly administered as
necessary. Once expressed, the nucleic acid molecule binds to the target mRNA.
Delivery of nucleic acid molecule expressing vectors could be systemic, such as by
intravenous or intra-muscular administration, by administration to target cells
ex-planted from the patient followed by reintroduction into the patient, or by any other
means that would allow for introduction into the desired target cell (for a review,
see Couture et al., 1996, TIG., 12, 510).

In one aspect, the invention features an expression vector comprising a nucleic 
acid sequence encoding at least one of the nucleic acid molecules disclosed in the 
instant invention. The nucleic acid sequence encoding the nucleic acid molecule of 
the instant invention is operably linked in a manner which allows expression of that
nucleic acid molecule.

In another aspect, the invention features an expression vector comprising: a) a
transcription initiation region (e.g., eukaryotic pol I, II or III initiation region); b) a
transcription termination region (e.g., eukaryotic pol I, II or III termination region);
c) a nucleic acid sequence encoding at least one of the nucleic acid catalysts of the
instant invention; and wherein said sequence is operably linked to said initiation
region and said termination region, in a manner which allows expression and/or
delivery of said nucleic acid molecule. The vector may optionally include an open
reading frame (ORF) for a protein operably linked on the 5' side or the 3'-side of the
sequence encoding the nucleic acid catalyst of the invention; and/or an intron
(intervening sequences).

Transcription of the nucleic acid molecule sequences is driven from a
promoter for eukaryotic RNA polymerase I (pol I), RNA polymerase II (pol II), or
RNA polymerase III (pol III). Transcripts from pol II or pol III promoters will be
expressed at high levels in all cells; the levels of a given pol II promoter in a given
cell type will depend on the nature of the gene regulatory sequences (enhancers,
silencers, etc.) present nearby. Prokaryotic RNA polymerase promoters are also
used, providing that the prokaryotic RNA polymerase enzyme is expressed in the
appropriate cells (Elroy-Stein and Moss, 1990, Proc. Natl. Acad. Sci. USA, 87,
6743-7; Gao and Huang, 1993, Nucleic Acids Res., 21, 2867-72; Lieber et al.,
1993, Methods Enzymol., 217, 47-66; Zhou et al., 1990, Mol. Cell. Biol., 10,
4529-37). All of these references are incorporated by reference herein.

Several investigators have demonstrated that nucleic acid molecules, such as
ribozymes expressed from such promoters, can function in mammalian cells (e.g.,
Kashani-Sabet et al., 1992, Antisense Res. Dev., 2, 3-15; Ojwang et al., 1992,
Proc. Natl. Acad. Sci. USA, 89, 10802-6; Chen et al., 1992, Nucleic Acids Res., 20,
4581-9; Yu et al., 1993, Proc. Natl. Acad. Sci. USA, 90, 6340-4; L'Huillier et al.,
1992, EMBO J., 11, 4411-8; Lisziewicz et al., 1993, Proc. Natl. Acad. Sci. U.S.A.,
90, 8000-4; Thompson et al., 1995, Nucleic Acids Res., 23, 2259; and Sullenger &
Cech, 1993, Science, 262, 1566). More specifically, transcription units such as the
ones derived from genes encoding U6 small nuclear (snRNA), transfer RNA (tRNA)
and adenovirus VA RNA are useful in generating high concentrations of desired
RNA molecules such as ribozymes in cells (Thompson et al., supra; Couture and
Stinchcomb, 1996, supra; Noonberg et al., 1994, Nucleic Acid Res., 22, 2830;
Noonberg et al., US Patent No. 5,624,803; Good et al., 1997, Gene Ther., 4, 45; and
Beigelman et al., International PCT Publication No. WO 96/18736; all of these
publications are incorporated by reference herein). The above ribozyme transcription
units can be incorporated into a variety of vectors for introduction into mammalian
cells, including but not restricted to, plasmid DNA vectors, viral DNA vectors (such
as adenovirus or adeno-associated virus vectors), or viral RNA vectors (such as
retroviral or alphavirus vectors) (for a review, see Couture and Stinchcomb, 1996,
supra).

In yet another aspect, the invention features an expression vector comprising a
nucleic acid sequence encoding at least one of the nucleic acid molecules of the
invention, in a manner which allows expression of that nucleic acid molecule. The
expression vector comprises, in one embodiment: a) a transcription initiation region;
b) a transcription termination region; c) a nucleic acid sequence encoding at least
one said nucleic acid molecule; and wherein said sequence is operably linked to said
initiation region and said termination region, in a manner which allows expression
and/or delivery of said nucleic acid molecule.

In another preferred embodiment, the expression vector comprises: a) a
transcription initiation region; b) a transcription termination region; c) an open
reading frame; d) a nucleic acid sequence encoding at least one said nucleic acid
molecule, wherein said sequence is operably linked to the 3'-end of said open
reading frame; and wherein said sequence is operably linked to said initiation region,
said open reading frame and said termination region, in a manner which allows
expression and/or delivery of said nucleic acid molecule.

In yet another embodiment, the expression vector comprises: a) a transcription
initiation region; b) a transcription termination region; c) an intron; d) a nucleic acid
sequence encoding at least one said nucleic acid molecule; and wherein said
sequence is operably linked to said initiation region, said intron and said termination
region, in a manner which allows expression and/or delivery of said nucleic acid
molecule.

In another embodiment, the expression vector comprises: a) a transcription
initiation region; b) a transcription termination region; c) an intron; d) an open
reading frame; e) a nucleic acid sequence encoding at least one said nucleic acid
molecule, wherein said sequence is operably linked to the 3'-end of said open
reading frame; and wherein said sequence is operably linked to said initiation region,
said intron, said open reading frame and said termination region, in a manner which
allows expression and/or delivery of said nucleic acid molecule.

Examples. 

The following are non-limiting examples showing the selection, isolation,
synthesis and activity of nucleic acids of the instant invention.

The following examples demonstrate the selection and design of antisense,
hammerhead, DNAzyme, NCH, Amberzyme, Zinzyme, or G-Cleaver ribozyme
molecules and binding/cleavage sites within CLCA1 RNA.

Example 1: Reporter System

Applicant used a target discovery and target validation approach to find
genes that are involved in chronic mucous hypersecretion. In order to discover genes
playing a role in the expression of mucins, a readily assayable reporter system was
devised. The reporter system consists of a plasmid construct, termed pMUC5AC-EGFP,
bearing a gene coding for Green Fluorescent Protein (GFP). The promoter
region of the GFP gene is replaced by a portion of the Mucin 5AC promoter
sufficient to direct efficient transcription of the GFP gene. The plasmid also
contains the neomycin drug resistance gene.

Example 2: Host Cell Line for Target Discovery

The cell line selected as host for these studies, NCI-H292 (ATCC CRL-1848),
is derived from a human lung mucoepidermoid carcinoma. The cells retain
mucoepidermoid characteristics in culture and endogenously express mucin 5AC
and mucin 2. The pMUC5AC-EGFP plasmid was transfected into NCI-H292 using
a cationic lipid formulation. Following transfection, the cells were subjected to
limiting dilution cloning under selection by 600 µg/mL Geneticin. Cells retaining
the pMUC5AC-EGFP plasmid survive the Geneticin treatment and form colonies
derived from single surviving cells. The resulting clonal cell lines were screened by
flow cytometry for the capacity to upregulate GFP production directed by the Mucin
5AC promoter. Treating the cells with sterilized M9 bacterial medium in which
Pseudomonas aeruginosa had been cultured (Pseudomonas conditioned medium,
PCM) induced the mucin promoter. The PCM is supplemented with phorbol
myristate acetate (PMA).

A clonal cell line highly responsive to mucin promoter induction, designated
H292/MUC5AC/EGFP Clone 8 (H292 Clone 8), was selected as the reporter line for
subsequent studies. The process for Target Discovery is described in Jarvis et al.,
International PCT publication No. WO 98/50530, incorporated by reference herein
in its entirety including the Figures.

Example 3: Ribozyme Library Construction

A ribozyme library was constructed with oligonucleotides containing ribozymes
with two randomized regions comprising six-nucleotide binding "arms" (Stem I and
Stem III of a ribozyme-substrate complex). Oligo sequence 5' and 3' of the
ribozyme contains restriction endonuclease cleavage sites for cloning. The 3'
trailing sequence forms a stem-loop for priming DNA polymerase extension to form
a double-stranded molecule. The double-stranded ribozyme library was cloned into
the U6+27 transcription unit located in the 5' LTR region of a retroviral vector
containing the human nerve growth factor receptor (hNGFr) reporter gene.
Positioning the U6+27/ribozyme transcription unit in the 5' LTR results in a
duplication of the transcription unit when the vector integrates into the host cell
genome. As a result, the ribozyme is transcribed by RNA polymerase III from
U6+27 and by RNA polymerase II activity directed by the 5' LTR. The ribozyme
library was packaged into retroviral particles that were used to infect and transduce
H292 Clone 8 cells. Assay of the hNGFr reporter indicated that 50% to 60% of
Clone 8 cells incorporated the ribozyme construct. Figures 5A and 5B describe the
generalized scheme used in the ribozyme library construction and target discovery.
By "randomized region" is meant a region of completely random sequence and/or
partially random sequence. By completely random sequence is meant a sequence
wherein theoretically there is equal representation of A, T, G and C nucleotides or
modified derivatives thereof, at each position in the sequence. By partially random
sequence is meant a sequence wherein there is an unequal representation of A, T, G
and C nucleotides or modified derivatives thereof, at each position in the sequence.
A partially random sequence can therefore have one or more positions of complete
randomness and one or more positions with defined nucleotides.
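
The distinction between completely random and partially random sequence can be illustrated with a short Python sketch. This is an illustration only and not the oligonucleotide synthesis procedure used by Applicant; the function names `library_size` and `random_arm` and the `fixed` argument are hypothetical. It assumes two six-nucleotide arms drawn from A, T, G and C, as described above.

```python
import random
from typing import Dict, Optional

BASES = "ATGC"   # nucleotides named in the definition of "randomized region"
ARM_LENGTH = 6   # six-nucleotide binding arms (Stem I and Stem III)


def library_size(arm_length: int = ARM_LENGTH, arms: int = 2) -> int:
    """Count the distinct sequences when every arm position is completely random."""
    return len(BASES) ** (arm_length * arms)


def random_arm(rng: random.Random,
               fixed: Optional[Dict[int, str]] = None,
               length: int = ARM_LENGTH) -> str:
    """Draw one binding arm.

    `fixed` maps positions to defined nucleotides (a partially random
    sequence). With no fixed positions, every position is drawn with equal
    probability from BASES (a completely random sequence).
    """
    fixed = fixed or {}
    return "".join(
        fixed[i] if i in fixed else rng.choice(BASES) for i in range(length)
    )


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so repeated runs give the same draws
    print(library_size())                  # possible two-arm combinations
    print(random_arm(rng))                 # completely random arm
    print(random_arm(rng, fixed={0: "G"})) # partially random arm, position 0 defined
```

Under these assumptions, a fully randomized pair of six-nucleotide arms has 4^12 = 16,777,216 possible sequences.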

Example 4: Enriching for Non-responders to Mucin Induction

Sorting of ribozyme library-containing cells was performed to enrich for cells
that produce less GFP after treatment with PCM and PMA. Lower GFP production
may be due to ribozyme action upon genes involved in the activation of the mucin
promoter. Alternatively, ribozymes may directly target the mucin/GFP transcript,
resulting in reduced GFP expression.

Cells were seeded at a density of 1 x 10^6 per 150 cm^2 style cell culture flask.
After 72 hours the standard cell culture medium was replaced with medium without
fetal bovine serum. After 24 hours of serum deprivation the cells were treated with
serum-containing medium supplemented with PCM (to 40%) and PMA (to 50 nM)
to induce GFP production via the mucin promoter. After 20 to 22 hours, cells were
monitored for GFP level on a FACStar Plus cell sorter.

Sorting was performed if 90% of ribozyme library cells from an unsorted 
control sample were induced to produce GFP above background levels. Two cell 
fractions were collected in each round of sorting. 

In the initial sort the M1 gate collected cells in luminescence channels 1 to 4.5,
those cells with the lowest GFP signal (5% of the induced population). The M2 sort
gate collected cells in luminescence channels 4.5 to 20, cells with low GFP signal
(10% of the induced population). The M1 and M2 fractions together represented the
15% of the induced population responding least to the GFP induction treatment. In
order to assure that the diversity of the ribozyme library was represented, 2.3 x 10^6
cells were collected in the M1 fraction and 4.6 x 10^6 cells were collected in the M2
fraction. The M1 and M2 fractions were cultured separately and representative
portions of each were cryopreserved after each round of sorting.

When treated with PCM and PMA prior to a second round of sorting, cells
from both the M1 and M2 fractions responded as before, with >90% of the cells
producing elevated levels of GFP. The same sorting criteria and sort gates were
used in the second round. As in the first round of sorting, the M1 sort gate collected
5% of the treated cells (those with little or no GFP) and the M2 gate collected 10%
of the cells. Two more rounds of sorting were performed using the same sorting
criteria.

Prior to the third round of sorting the M1 fraction showed a three-fold
enrichment of GFP negative cells. Prior to the fourth round of sorting both the M1
and M2 fractions were significantly enriched in cells unresponsive to the GFP
induction treatment.

Following the third round of sorting the M1 fraction was selected to generate a
database of ribozymes present in the sorted cells.

Example 5: Recovery of Ribozyme Sequence from Sorted Cells 

Genomic DNA was obtained from sorted ribozyme library cells by standard
methods. Nested polymerase chain reaction (PCR) primers (Sequence ID Nos. 5468
and 5469) that hybridized to the retroviral vector 5' and 3' of the ribozyme were
used to recover and amplify the ribozyme sequences from the Clone 8 library cell
DNA. The PCR product was ligated into a bacterial cloning vector. Two methods
were developed to use the recovered ribozyme library, in plasmid form, to generate a
database of ribozyme binding arm sequences. In the first approach the library was
cloned into E. coli. DNA was prepared by plasmid isolation from bacterial colonies
or by direct colony PCR and ribozyme arm sequence was determined. Over 450
sequences have been obtained by this method. A second method used the ribozyme
library to transfect H292 Clone 8 cells. Clonal lines of stably transfected cells were
established and induced with PCM and PMA. Those lines which failed to respond
to GFP induction were probed by PCR for single ribozyme integration events. Over
300 sequences were obtained in this manner. The unique ribozyme sequences
obtained by both methods were added to a Target Sequence Tag (TST) database.

Example 6: Bioinformatics

After sequencing 760 recovered ribozymes, 171 unique sequences were found.
Of the unique sequences, 91 have been recovered once and 80 have been found
multiple times. Most of the repeated sequences have been found 2 to 11 times. One
sequence has been recovered 145 times. The diversity of the sequences obtained
indicates that the sorted cells are a promising source of information leading to target
discovery.
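
The tallies reported above (total reads, unique sequences, sequences recovered once or multiple times) are the kind of summary that can be produced with a simple frequency count. The following Python sketch shows one way to compute such a summary; it is not Applicant's software, the function name `summarize_recoveries` is hypothetical, and the arm sequences in the usage example are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize_recoveries(sequences: Iterable[str]) -> Dict[str, int]:
    """Summarize how often each recovered binding-arm sequence was observed."""
    counts = Counter(sequences)
    recovered_once = sum(1 for n in counts.values() if n == 1)
    return {
        "total": sum(counts.values()),           # all sequenced clones
        "unique": len(counts),                   # distinct sequences
        "recovered_once": recovered_once,        # seen exactly one time
        "recovered_multiple": len(counts) - recovered_once,
        "max_recoveries": max(counts.values(), default=0),
    }


if __name__ == "__main__":
    # Hypothetical reads, not data from the sorted library.
    reads = ["ACGTTGCCAGTA", "ACGTTGCCAGTA", "TTGACCGGATCA", "GGCATTACGTAC"]
    print(summarize_recoveries(reads))
```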

Ribozyme binding arm sequences were compared to public and private gene 
data banks. Gene matches were compiled according to perfect and imperfect 
matches. Potential gene targets were categorized by the number of different 
ribozyme sequences matching each gene. Multiple ribozyme matches have been 
found for 180 genes. Genes with more than one perfect ribozyme match were given 
close attention. A total of 34 genes have been verified to date to have multiple 
perfect ribozyme matches. Of those at least 17 have protein products of known 
function. 

Two perfect ribozyme matches were found for human calcium activated
chloride channel-1 (hCLCA1). Each ribozyme matches at two sites in the hCLCA1
gene. A third sorted library ribozyme sequence "hits" hCLCA1 but has a single
nucleotide mismatch.
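
The classification of matches as perfect or imperfect can be sketched as a sliding-window comparison that counts mismatches at each position of a target sequence. The Python sketch below is a simplified illustration, not the search method actually used against the gene data banks; the function name `find_matches` is hypothetical and the sequences in the usage example are invented.

```python
from typing import List, Tuple


def find_matches(query: str, target: str,
                 max_mismatches: int = 1) -> List[Tuple[int, int]]:
    """Return (start, mismatches) for each window of `target` that matches
    `query` with at most `max_mismatches` substitutions.

    A result with 0 mismatches is a perfect match; 1 mismatch corresponds to
    a single-nucleotide-mismatch "hit".
    """
    hits = []
    width = len(query)
    for start in range(len(target) - width + 1):
        window = target[start:start + width]
        mismatches = sum(a != b for a, b in zip(query, window))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    # Hypothetical query and target, chosen only to show both kinds of hit.
    for start, mismatches in find_matches("CGUACG", "AACGUACGAAACGUACCAA"):
        kind = "perfect" if mismatches == 0 else "single mismatch"
        print(start, kind)
```

In practice the query would be the target-site sequence implied by a recovered pair of binding arms, and insertions or deletions would need a proper alignment tool rather than this substitution-only comparison.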

Example 7: Selection of hCLCA1 for Validation

The selection of hCLCA1 as a candidate for target validation was based on
bioinformatics and on emerging data in murine models of mucous hypersecretion in
the trachea and lung. Two ribozymes (Seq. ID Nos. 2332 and 2273) recovered from
cells that no longer respond to mucin promoter/GFP induction match perfectly to
hCLCA1. A third has a single mismatch. Evidence from two murine models
indicates a correlation between mucous hypersecretion in the lung and strong
upregulation of gob-5 (GenBank AB017156), a murine homologue of hCLCA1.

Example 8: Validation of hCLCA1

To validate hCLCA1 as a regulator of MUC5AC expression, GeneBloc
reagents were designed (Table IX) to the hCLCA1 cDNA sequence (GenBank
AF039400). GeneBloc reagents are complexed with a cationic lipid formulation
prior to administration to H292/MUC5AC/GFP Clone 8 cells. Concentrations of the
GeneBloc reagents administered range from 30 nM to 120 nM at cationic lipid
concentrations of 4-6 ng/mL. Cells are treated with GeneBloc reagents for 72 to 96
hours. Before the termination of GeneBloc treatment, PCM (to 40%) and PMA (to
50 nM) are added to induce the MUC5AC promoter. After twenty hours of
induction the cells are harvested and assayed for phenotypic and molecular
parameters. Reduced GFP expression in GeneBloc treated cells (measured by flow
cytometry) is taken as evidence for validation of hCLCA1. Knockdown of hCLCA1
RNA in GeneBloc treated cells can correlate with reduced endogenous MUC5AC
RNA and reduced GFP RNA (from the MUC5AC/GFP construct) to complete
validation of hCLCA1.

Example 9: Identification of Potential Target Sites in Human CLCA1 RNA 

The sequence of human CLCA1 is screened for accessible sites using a
computer-folding algorithm. Regions of the RNA are identified that do not form
secondary folding structures. These regions contain potential ribozyme and/or
antisense binding/cleavage sites. The sequences of these binding/cleavage sites are
shown in Tables III-IX.

Example 10: Selection of Enzymatic Nucleic Acid Cleavage Sites in Human CLCA1
RNA

Ribozyme target sites are chosen by analyzing sequences of Human CLCA1
(GenBank accession numbers: NM_001285 and AF039400) and prioritizing the sites
on the basis of folding. Ribozymes are designed that could bind each target and are
individually analyzed by computer folding (Christoffersen et al., 1994, J. Mol. Struc.
Theochem, 311, 273; Jaeger et al., 1989, Proc. Natl. Acad. Sci. USA, 86, 7706) to
assess whether the ribozyme sequences fold into the appropriate secondary structure.
Those ribozymes with unfavorable intramolecular interactions between the binding
arms and the catalytic core are eliminated from consideration. As noted below,
varying binding arm lengths can be chosen to optimize activity. Generally, at least 5
bases on each arm are able to bind to, or otherwise interact with, the target RNA.
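
The relationship between a target site and the hammerhead ribozyme designed against it can be sketched in Python. The sketch follows the pattern visible in the rows of Table III, where each ribozyme reads as the reverse complement of the 3' target arm, the core sequence CUGAUGAG GCCGUUAGGC CGAA, and the reverse complement of the 5' target arm. The function names and the eight-nucleotide default arm length are assumptions made for illustration, and the sketch performs no folding analysis, so it covers only the sequence-construction step and not the site prioritization described above.

```python
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGU", "UGCA")
# Core segment as it appears in the Table III ribozyme column (spaces removed).
CORE = "CUGAUGAG" + "GCCGUUAGGC" + "CGAA"


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def find_uh_sites(rna: str, arm: int = 8) -> List[Tuple[str, str, str]]:
    """List candidate sites as (5' arm, H nucleotide, 3' arm).

    A site is reported where a U is followed by H (A, C or U) and enough
    flanking sequence exists for both arms; the 5' arm ends with that U.
    """
    sites = []
    for i in range(arm, len(rna) - arm):
        if rna[i - 1] == "U" and rna[i] in ("A", "C", "U"):
            sites.append((rna[i - arm:i], rna[i], rna[i + 1:i + 1 + arm]))
    return sites


def design_hammerhead(five_arm: str, h_nt: str, three_arm: str) -> str:
    """Build a hammerhead sequence whose arms pair with the two target arms."""
    if not five_arm.endswith("U") or h_nt not in ("A", "C", "U"):
        raise ValueError("target site must contain UH at the cleavage position")
    return reverse_complement(three_arm) + CORE + reverse_complement(five_arm)


if __name__ == "__main__":
    # Substrate from the first row of Table III, written without spaces.
    for site in find_uh_sites("CUAAUGCUUUUGGUACA"):
        print(site, design_hammerhead(*site))
```

Applied to the first substrate in Table III (CUAAUGCU U UUGGUACA), the sketch reproduces the ribozyme sequence listed in that row.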

Example 11: Chemical Synthesis and Purification of Ribozymes and Antisense for
Efficient Cleavage and/or Blocking of CLCA1 RNA

Ribozymes and antisense constructs are designed to anneal to various sites in
the RNA message. The binding arms of the ribozymes are complementary to the
target site sequences described above, while the antisense constructs are fully
complementary to the target site sequences described above. The ribozymes and
antisense constructs were chemically synthesized. The method of synthesis used
followed the procedure for normal RNA synthesis as described above and in Usman
et al. (1987, J. Am. Chem. Soc., 109, 7845), Scaringe et al. (1990, Nucleic Acids
Res., 18, 5433) and Wincott et al., supra, and made use of common nucleic acid
protecting and coupling groups, such as dimethoxytrityl at the 5'-end, and
phosphoramidites at the 3'-end. The average stepwise coupling yields were typically
>98%.

Ribozymes and antisense constructs are also synthesized from DNA templates
using bacteriophage T7 RNA polymerase (Milligan and Uhlenbeck, 1989, Methods
Enzymol., 180, 51). Ribozymes and antisense constructs are purified by gel
electrophoresis using general methods or are purified by high pressure liquid
chromatography (HPLC; see Wincott et al., supra; the totality of which is hereby
incorporated herein by reference) and are resuspended in water. The sequences of
the chemically synthesized ribozymes and antisense constructs used in this study are
shown below in Tables III-IX.

Indications 

Particular conditions and disease states that can be associated with CLCA1 
expression modulation include but are not limited to Chronic Obstructive Pulmonary 
Disease (COPD), chronic bronchitis, asthma, cystic fibrosis, obstructive bowel 
syndrome, and any other diseases or conditions that are related to or will respond to 
the levels of CLCA1 in a cell or tissue, alone or in combination with other therapies. 

The present body of knowledge in CLCA1 research indicates the need for 
methods to assay CLCA1 activity and for compounds that can regulate CLCA1 
expression for research, diagnostic, and therapeutic use. 

The nucleic acid molecules of the present invention may also be administered to 
a patient in combination with other therapeutic compounds to increase the overall 
therapeutic effect. The use of multiple compounds to treat an indication may 
increase the beneficial effects while reducing the presence of side effects. Oxygen 
therapy, bronchodilators, corticosteroids, antibacterials, vaccinations, acetylcysteine, 
mucokinetic agents, and DNase (Pulmozyme), are non-limiting examples of methods 
and/or treatments that can be used in combination with nucleic acid molecules of the 
invention. Those skilled in the art will recognize that other drug compounds and 
therapies can be similarly and readily combined with the nucleic acid molecules of 
the instant invention (e.g. ribozymes and antisense molecules) and are, therefore, 
within the scope of the instant invention. 

Cell Culture

The cell culture system described in Example 8 can be used to evaluate nucleic
acid molecules of the invention for efficacy in CLCA1 and mucin modulation.

Animal Models

Numerous reports can be found which describe animal models relevant to
disease states such as COPD and cystic fibrosis. These models can be used to
determine efficacy of the nucleic acid molecules of the instant invention targeting
such disease states or conditions. Animal models for chronic obstructive pulmonary
disease (COPD) are described by Shapiro, 2000, Am. J. Respir. Cell Mol. Biol., 22(1),
4-7; Hogg, 1998, Ika Daigaku Zasshi, 56(3), 429-432; and Garssen et al., 1997,
Inhalation Toxicol., 9(6), 581-599. Animal models for cystic fibrosis are described
by Kent et al., 1997, J. Clin. Invest., 100(12), 3060-3069; Hill et al., 1997, 62(1),
113-122; Grubb and Gabriel, 1997, Am. J. Physiol., 272, G258-G266; Rozmahel,
1996, From: Diss. Abstr. Int. B 1997, 57(8), 4863; Van Doorninck et al., 1995,
EMBO J., 14(18), 4403-11; and Zeiher et al., 1995, J. Clin. Invest., 96(4), 2051-64.

Diagnostic uses 

The nucleic acid molecules of this invention (e.g., ribozymes) may be used as
diagnostic tools to examine genetic drift and mutations within diseased cells or to
detect the presence of CLCA1 RNA in a cell. The close relationship between
ribozyme activity and the structure of the target RNA allows the detection of
mutations in any region of the molecule which alters the base-pairing and
three-dimensional structure of the target RNA. By using multiple ribozymes described in
this invention, one may map nucleotide changes which are important to RNA
structure and function in vitro, as well as in cells and tissues. Cleavage of target
RNAs with ribozymes may be used to inhibit gene expression and define the role
(essentially) of specified gene products in the progression of disease. In this manner,
other genetic targets may be defined as important mediators of the disease. These
experiments will lead to better treatment of the disease progression by affording the
possibility of combinational therapies (e.g., multiple ribozymes targeted to different
genes, ribozymes coupled with known small molecule inhibitors, or intermittent
treatment with combinations of ribozymes and/or other chemical or biological
molecules). Other in vitro uses of ribozymes of this invention are well known in the
art, and include detection of the presence of mRNAs associated with a CLCA1-related
condition. Such RNA is detected by determining the presence of a cleavage product
after treatment with a ribozyme using standard methodology.

In a specific example, ribozymes which can cleave only wild-type or mutant
forms of the target RNA are used for the assay. The first ribozyme is used to
identify wild-type RNA present in the sample and the second ribozyme will be used
to identify mutant RNA in the sample. As reaction controls, synthetic substrates of
both wild-type and mutant RNA will be cleaved by both ribozymes to demonstrate
the relative ribozyme efficiencies in the reactions and the absence of cleavage of the
"non-targeted" RNA species. The cleavage products from the synthetic substrates
will also serve to generate size markers for the analysis of wild-type and mutant
RNAs in the sample population. Thus, each analysis can require two ribozymes, two
substrates and one unknown sample, which will be combined into six reactions. The
presence of cleavage products will be determined using an RNAse protection assay
so that full-length and cleavage fragments of each RNA can be analyzed in one lane
of a polyacrylamide gel. It is not absolutely required to quantify the results to gain
insight into the expression of mutant RNAs and putative risk of the desired
phenotypic changes in target cells. The expression of mRNA whose protein product
is implicated in the development of the phenotype (i.e., CLCA1) is adequate to
establish risk. If probes of comparable specific activity are used for both transcripts,
then a qualitative comparison of RNA levels will be adequate and will decrease the
cost of the initial diagnosis. Higher mutant form to wild-type ratios will be
correlated with higher risk whether RNA levels are compared qualitatively or
quantitatively.
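
The count of six reactions follows from pairing each of the two ribozymes with each of the three RNA inputs (two synthetic substrates and one unknown sample). A minimal Python sketch of that enumeration is given below; the labels and the function name `assay_reactions` are hypothetical and serve only to make the arithmetic explicit.

```python
from itertools import product
from typing import List, Tuple

# Illustrative labels for the components named in the text.
RIBOZYMES = ("wild-type-specific ribozyme", "mutant-specific ribozyme")
RNA_INPUTS = (
    "synthetic wild-type substrate",
    "synthetic mutant substrate",
    "unknown sample",
)


def assay_reactions() -> List[Tuple[str, str]]:
    """Pair every ribozyme with every RNA input (2 x 3 combinations)."""
    return list(product(RIBOZYMES, RNA_INPUTS))


if __name__ == "__main__":
    for number, (ribozyme, rna) in enumerate(assay_reactions(), start=1):
        print(number, ribozyme, "+", rna)
```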

Additional Uses 

Sequence-specific enzymatic nucleic acid molecules of the instant invention
might have many of the same applications for the study of RNA
that DNA restriction endonucleases have for the study of DNA (Nathans et al., 1975,
Ann. Rev. Biochem., 44, 273). For example, the pattern of restriction fragments could
be used to establish sequence relationships between two related RNAs, and large
RNAs could be specifically cleaved to fragments of a size more useful for study. The
ability to engineer sequence specificity of the enzymatic nucleic acid molecule is
ideal for cleavage of RNAs of unknown sequence. Applicant describes the use of
nucleic acid molecules to down-regulate gene expression of target genes in bacterial,
microbial, fungal, viral, and eukaryotic systems including plant or mammalian cells.

All patents and publications mentioned in the specification are indicative of the
levels of skill of those skilled in the art to which the invention pertains. All
references cited in this disclosure are incorporated by reference to the same extent as
if each reference had been incorporated by reference in its entirety individually.

One skilled in the art would readily appreciate that the present invention is well
adapted to carry out the objects and obtain the ends and advantages mentioned, as
well as those inherent therein. The methods and compositions described herein as
presently representative of preferred embodiments are exemplary and are not
intended as limitations on the scope of the invention. Changes therein and other
uses will occur to those skilled in the art, which are encompassed within the spirit of
the invention and are defined by the scope of the claims.

It will be readily apparent to one skilled in the art that varying substitutions and 
modifications may be made to the invention disclosed herein without departing from 
the scope and spirit of the invention. Thus, such additional embodiments are within 
the scope of the present invention and the following claims. 

The invention illustratively described herein suitably may be practiced in the
absence of any element or elements, limitation or limitations which is not
specifically disclosed herein. Thus, for example, in each instance herein any of the
terms "comprising", "consisting essentially of" and "consisting of" may be replaced
with either of the other two terms. The terms and expressions which have been
employed are used as terms of description and not of limitation, and there is no
intention in the use of such terms and expressions of excluding any equivalents
of the features shown and described or portions thereof, but it is recognized that
various modifications are possible within the scope of the invention claimed. Thus,
it should be understood that although the present invention has been specifically
disclosed by preferred embodiments, optional features, modification and variation of
the concepts herein disclosed may be resorted to by those skilled in the art, and that
such modifications and variations are considered to be within the scope of this
invention as defined by the description and the appended claims.

In addition, where features or aspects of the invention are described in terms of
Markush groups or other grouping of alternatives, those skilled in the art will
recognize that the invention is also thereby described in terms of any individual
member or subgroup of members of the Markush group or other group.

Other embodiments are within the following claims.

Characteristics of naturally occurring ribozymes

Group I Introns

• Size: ~150 to >1000 nucleotides.

• Requires a U in the target sequence immediately 5' of the cleavage site.

• Binds 4-6 nucleotides at the 5'-side of the cleavage site.

• Reaction mechanism: attack by the 3'-OH of guanosine to generate cleavage
products with 3'-OH and 5'-guanosine.

• Additional protein cofactors required in some cases to help folding and
maintenance of the active structure.

• Over 300 known members of this class. Found as an intervening sequence in
Tetrahymena thermophila rRNA, fungal mitochondria, chloroplasts, phage T4,
blue-green algae, and others.

• Major structural features largely established through phylogenetic
comparisons, mutagenesis, and biochemical studies [i,ii].

• Complete kinetic framework established for one ribozyme [iii,iv,v,vi].

• Studies of ribozyme folding and substrate docking underway [vii,viii,ix].

• Chemical modification investigation of important residues well established.

• The small (4-6 nt) binding site may make this ribozyme too non-specific for
targeted RNA cleavage; however, the Tetrahymena group I intron has been
used to repair a "defective" β-galactosidase message by the ligation of new
β-galactosidase sequences onto the defective message [xii].

RNAse P RNA (M1 RNA)

• Size: ~290 to 400 nucleotides.

• RNA portion of a ubiquitous ribonucleoprotein enzyme.

• Cleaves tRNA precursors to form mature tRNA [xiii].

• Reaction mechanism: possible attack by M2+-OH to generate cleavage products
with 3'-OH and 5'-phosphate.

• RNAse P is found throughout the prokaryotes and eukaryotes. The RNA
subunit has been sequenced from bacteria, yeast, rodents, and primates.

• Recruitment of endogenous RNAse P for therapeutic applications is possible
through hybridization of an External Guide Sequence (EGS) to the target RNA.

• Important phosphate and 2' OH contacts recently identified [xvi,xvii].

Group II Introns

• Size: >1000 nucleotides.

• Trans cleavage of target RNAs recently demonstrated [xviii,xix].

• Sequence requirements not fully determined.

• Reaction mechanism: 2'-OH of an internal adenosine generates cleavage
products with 3'-OH and a "lariat" RNA containing a 3'-5' and a 2'-5' branch
point.

• Only natural ribozyme with demonstrated participation in DNA cleavage
[xx,xxi] in addition to RNA cleavage and ligation.

• Major structural features largely established through phylogenetic comparisons
[xxii].

• Important 2' OH contacts beginning to be identified [xxiii].

• Kinetic framework under development [xxiv].

Neurospora VS RNA

• Size: ~144 nucleotides.

• Trans cleavage of hairpin target RNAs recently demonstrated [xxv].

• Sequence requirements not fully determined.

• Reaction mechanism: attack by 2'-OH 5' to the scissile bond to generate
cleavage products with 2',3'-cyclic phosphate and 5'-OH ends.

• Binding sites and structural requirements not fully determined.

• Only 1 known member of this class. Found in Neurospora VS RNA.

Hammerhead Ribozyme

(see text for references)

• Size: ~13 to 40 nucleotides.

• Requires the target sequence UH immediately 5' of the cleavage site.

• Binds a variable number of nucleotides on both sides of the cleavage site.

• Reaction mechanism: attack by 2'-OH 5' to the scissile bond to generate
cleavage products with 2',3'-cyclic phosphate and 5'-OH ends.

• 14 known members of this class. Found in a number of plant pathogens
(virusoids) that use RNA as the infectious agent.

• Essential structural features largely defined, including 2 crystal structures.

• Minimal ligation activity demonstrated (for engineering through in vitro
selection) [xxviii].

• Complete kinetic framework established for two or more ribozymes [xxix].

• Chemical modification investigation of important residues well established
[xxx].

Hairpin Ribozyme

• Size: ~50 nucleotides.

• Requires the target sequence GUC immediately 3' of the cleavage site.

• Binds 4-6 nucleotides at the 5'-side of the cleavage site and a variable number to
the 3'-side of the cleavage site.

• Reaction mechanism: attack by 2'-OH 5' to the scissile bond to generate
cleavage products with 2',3'-cyclic phosphate and 5'-OH ends.

• 3 known members of this class. Found in three plant pathogens (satellite RNAs
of the tobacco ringspot virus, arabis mosaic virus and chicory yellow mottle
virus) which use RNA as the infectious agent.

• Essential structural features largely defined [xxxi,xxxii,xxxiii,xxxiv].

• Ligation activity (in addition to cleavage activity) makes ribozyme amenable to
engineering through in vitro selection [xxxv].

• Complete kinetic framework established for one ribozyme [xxxvi].

• Chemical modification investigation of important residues begun [xxxvii,xxxviii].

Hepatitis Delta Virus (HDV) Ribozyme

• Size: ~60 nucleotides.

• Trans cleavage of target RNAs demonstrated [xxxix].

• Binding sites and structural requirements not fully determined, although no
sequences 5' of cleavage site are required. Folded ribozyme contains a
pseudoknot structure [xl].

• Reaction mechanism: attack by 2'-OH 5' to the scissile bond to generate
cleavage products with 2',3'-cyclic phosphate and 5'-OH ends.

• Only 2 known members of this class. Found in human HDV.

• Circular form of HDV is active and shows increased nuclease stability [xli] 

Table III: Human CLCA1 Hammerhead Ribozyme and Target Sequence 249.021 



Pos   Substrate   SeqID No.   Ribozyme   Rz SeqID No.

11   CUAAUGCU U UUGGUACA   1   UGUACCAA CUGAUGAG GCCGUUAGGC CGAA AGCAUUAG   2190
12   UAAUGCUU U UGGUACAA   2   UUGUACCA CUGAUGAG GCCGUUAGGC CGAA AAGCAUUA   2191
13   AAUGCUUU U GGUACAAA   3   UUUGUACC CUGAUGAG GCCGUUAGGC CGAA AAAGCAUU   2192
17   CUUUUGGU A CAAAUGGA   4   UCCAUUUG CUGAUGAG GCCGUUAGGC CGAA ACCAAAAG   2193
34   UGUGGAAU A UAAUUGAA   5   UUCAAUUA CUGAUGAG GCCGUUAGGC CGAA AUUCCACA   2194
36   UGGAAUAU A AUUGAAUA   6   UAUUCAAU CUGAUGAG GCCGUUAGGC CGAA AUAUUCCA   2195
39   AAUAUAAU U GAAUAUUU   7   AAAUAUUC CUGAUGAG GCCGUUAGGC CGAA AUUAUAUU   2196
44   AAUUGAAU A UUUUCUUG   8   CAAGAAAA CUGAUGAG GCCGUUAGGC CGAA AUUCAAUU   2197
46   UUGAAUAU U UUCUUGUU   9   AACAAGAA CUGAUGAG GCCGUUAGGC CGAA AUAUUCAA   2198
47   UGAAUAUU U UCUUGUUU   10   AAACAAGA CUGAUGAG GCCGUUAGGC CGAA AAUAUUCA   2199
48   GAAUAUUU U CUUGUUUA   11   UAAACAAG CUGAUGAG GCCGUUAGGC CGAA AAAUAUUC   2200
49   AAUAUUUU C UUGUUUAA   12   UUAAACAA CUGAUGAG GCCGUUAGGC CGAA AAAAUAUU   2201
51   UAUUUUCU U GUUUAAGG   13   CCUUAAAC CUGAUGAG GCCGUUAGGC CGAA AGAAAAUA   2202
54   UUUCUUGU U UAAGGGGA   14   UCCCCUUA CUGAUGAG GCCGUUAGGC CGAA ACAAGAAA   2203
55   UUCUUGUU U AAGGGGAG   15   CUCCCCUU CUGAUGAG GCCGUUAGGC CGAA AACAAGAA   2204
56   UCUUGUUU A AGGGGAGC   16   GCUCCCCU CUGAUGAG GCCGUUAGGC CGAA AAACAAGA   2205
77   AGAGGUGU U GAGGUUAU   17   AUAACCUC CUGAUGAG GCCGUUAGGC CGAA ACACCUCU   2206
83   GUUGAGGU U AUGUCAAG   18   CUUGACAU CUGAUGAG GCCGUUAGGC CGAA ACCUCAAC   2207
84   UUGAGGUU A UGUCAAGC   19   GCUUGACA CUGAUGAG GCCGUUAGGC CGAA AACCUCAA   2208
88   GGUUAUGU C AAGCAUCU   20   AGAUGCUU CUGAUGAG GCCGUUAGGC CGAA ACAUAACC   2209
95   UCAAGCAU C UGGCACAG   21   CUGUGCCA CUGAUGAG GCCGUUAGGC CGAA AUGCUUGA   2210
122   AUGGAAAU A UUUACAAG   22   CUUGUAAA CUGAUGAG GCCGUUAGGC CGAA AUUUCCAU   2211
124   GGAAAUAU U UACAAGUA   23   UACUUGUA CUGAUGAG GCCGUUAGGC CGAA AUAUUUCC   2212
125   GAAAUAUU U ACAAGUAC   24   GUACUUGU CUGAUGAG GCCGUUAGGC CGAA AAUAUUUC   2213
126   AAAUAUUU A CAAGUACG   25   CGUACUUG CUGAUGAG GCCGUUAGGC CGAA AAAUAUUU   2214
132   UUACAAGU A CGCAAUUU   26   AAAUUGCG CUGAUGAG GCCGUUAGGC CGAA ACUUGUAA   2215
139   UACGCAAU U UGAGACUA   27   UAGUCUCA CUGAUGAG GCCGUUAGGC CGAA AUUGCGUA   2216
140   ACGCAAUU U GAGACUAA   28   UUAGUCUC CUGAUGAG GCCGUUAGGC CGAA AAUUGCGU   2217
147   UUGAGACU A AGAUAUUG   29   CAAUAUCU CUGAUGAG GCCGUUAGGC CGAA AGUCUCAA   2218
152   ACUAAGAU A UUGUUAUC   30   GAUAACAA CUGAUGAG GCCGUUAGGC CGAA AUCUUAGU   2219
154   UAAGAUAU U GUUAUCAU   31   AUGAUAAC CUGAUGAG GCCGUUAGGC CGAA AUAUCUUA   2220
157   GAUAUUGU U AUCAUUCU   32   AGAAUGAU CUGAUGAG GCCGUUAGGC CGAA ACAAUAUC   2221
158   AUAUUGUU A UCAUUCUC   33   GAGAAUGA CUGAUGAG GCCGUUAGGC CGAA AACAAUAU   2222
160   AUUGUUAU C AUUCUCCU   34   AGGAGAAU CUGAUGAG GCCGUUAGGC CGAA AUAACAAU   2223
163   GUUAUCAU U CUCCUAUU   35   AAUAGGAG CUGAUGAG GCCGUUAGGC CGAA AUGAUAAC   2224
164   UUAUCAUU C UCCUAUUG   36   CAAUAGGA CUGAUGAG GCCGUUAGGC CGAA AAUGAUAA   2225
166   AUCAUUCU C CUAUUGAA   37   UUCAAUAG CUGAUGAG GCCGUUAGGC CGAA AGAAUGAU   2226
169   AUUCUCCU A UUGAAGAC   38   GUCUUCAA CUGAUGAG GCCGUUAGGC CGAA AGGAGAAU   2227
171   UCUCCUAU U GAAGACAA   39   UUGUCUUC CUGAUGAG GCCGUUAGGC CGAA AUAGGAGA   2228
187   AGAGCAAU A GUAAAACA   40   UGUUUUAC CUGAUGAG GCCGUUAGGC CGAA AUUGCUCU   2229
190   GCAAUAGU A AAACACAU   41   AUGUGUUU CUGAUGAG GCCGUUAGGC CGAA ACUAUUGC   2230
199   AAACACAU C AGGUCAGG   42   CCUGACCU CUGAUGAG GCCGUUAGGC CGAA AUGUGUUU   2231
204   CAUCAGGU C AGGGGGUU   43   AACCCCCU CUGAUGAG GCCGUUAGGC CGAA ACCUGAUG   2232
212   CAGGGGGU U AAAGACCU   44   AGGUCUUU CUGAUGAG GCCGUUAGGC CGAA ACCCCCUG   2233
213   AGGGGGUU A AAGACCUG   45   CAGGUCUU CUGAUGAG GCCGUUAGGC CGAA AACCCCCU   2234
226   CCUGUGAU A AACCACUU   46   AAGUGGUU CUGAUGAG GCCGUUAGGC CGAA AUCACAGG   2235
234   AAACCACU U CCGAUAAG   47   CUUAUCGG CUGAUGAG GCCGUUAGGC CGAA AGUGGUUU   2236
235   AACCACUU C CGAUAAGU   48   ACUUAUCG CUGAUGAG GCCGUUAGGC CGAA AAGUGGUU   2237
240   CUUCCGAU A AGUUGGAA   49   UUCCAACU CUGAUGAG GCCGUUAGGC CGAA AUCGGAAG   2238
244   CGAUAAGU U GGAAACGU   50   ACGUUUCC CUGAUGAG GCCGUUAGGC CGAA ACUUAUCG   2239
257   ACGUGUGU C UAUAUUUU   51   AAAAUAUA CUGAUGAG GCCGUUAGGC CGAA ACACACGU   2240
259   GUGUGUCU A UAUUUUCA   52   UGAAAAUA CUGAUGAG GCCGUUAGGC CGAA AGACACAC   2241



261 
263 



GUGUCUAU A UUUUCAUA 



UAUGAAAA CUGAUGAG GCCGUUAGGC CGAA AUAGACAC 



GUCUAUAU U UUCAUAUC 



GAUAUGAA CUGAUGAG GCCGUUAGGC CGAA AUAUAGAC 



264 
265 
266 
269 
271 
275 
277 
279 
281 
283 
289 
303 
304 
307 
316 
317 



UCUAUAUU U UCAUAUCU 



AGAUAUGA CUGAUGAG GCCGUUAGGC CGAA AAUAUAGA 



CUAUAUUU U CAUAUCUG 



CAGAUAUG CUGAUGAG GCCGUUAGGC CGAA AAAUAUAG 



UAUAUUUU C AUAUCUGU 



ACAGAUAU CUGAUGAG GCCGUUAGGC CGAA AAAAUAUA 



AUUUUCAU A UCUGUAUA 



UAUACAGA CUGAUGAG GCCGUUAGGC CGAA AUGAAAAU 



UUUCAUAU C UGUAUAUA 



UAUAUACA CUGAUGAG GCCGUUAGGC CGAA AUAUGAAA 



AUAUCUGU A UAUAUAUA 



UAUAUAUA CUGAUGAG GCCGUUAGGC CGAA ACAGAUAU 



AUCUGUAU A UAUAUAAU 



AUUAUAUA CUGAUGAG GCCGUUAGGC CGAA AUACAGAU 



CUGUAUAU A UAUAAUGG 



CCAUUAUA CUGAUGAG GCCGUUAGGC CGAA AUAUACAG 



GUAUAUAU A UAAUGGUA 



UACCAUUA CUGAUGAG GCCGUUAGGC CGAA AUAUAUAC 



AUAUAUAU A AUGGUAAA 



UUUACCAU CUGAUGAG GCCGUUAGGC CGAA AUAUAUAU 



AUAAUGGU A AAGAAAGA 



UCUUUCUU CUGAUGAG GCCGUUAGGC CGAA ACCAUUAU 



AGACACCU U CGUAACCC 



GGGUUACG CUGAUGAG GCCGUUAGGC CGAA AGGUGUCU 



GACACCUU C GUAACCCG 



CGGGUUAC CUGAUGAG GCCGUUAGGC CGAA AAGGUGUC 



ACCUUCGU A ACCCGCAU 



AUGCGGGU CUGAUGAG GCCGUUAGGC CGAA ACGAAGGU 



ACCCGCAU U UUCCAAAG 



CUUUGGAA CUGAUGAG GCCGUUAGGC CGAA AUGCGGGU 



CCCGCAUU U UCCAAAGA 



UCUUUGGA CUGAUGAG GCCGUUAGGC CGAA AAUGCGGG 



CCGCAUUU U CCAAAGAG 



CUCUUUGG CUGAUGAG GCCGUUAGGC CGAA AAAUGCGG 



319 
333 
346 
362 
363 
364 
370 
371 
377 
378 
381 



CGCAUUUU C CAAAGAGA 



UCUCUUUG CUGAUGAG GCCGUUAGGC CGAA AAAAUGCG 



AGAGGAAU C ACAGGGAG 



CUCCCUGU CUGAUGAG GCCGUUAGGC CGAA AUUCCUCU 



GGAGAUGU A CAGCAAUG 



CAUUGCUG CUGAUGAG GCCGUUAGGC CGAA ACAUCUCC 



GGGGCCAU U UAAGAGUU 



AACUCUUA CUGAUGAG GCCGUUAGGC CGAA AUGGCCCC 



GGGCCAUU U AAGAGUUC 



GAACUCUU CUGAUGAG GCCGUUAGGC CGAA AAUGGCCC 



GGCCAUUU A AGAGUUCU 



AGAACUCU CUGAUGAG GCCGUUAGGC CGAA AAAUGGCC 



UUAAGAGU U CUGUGUUC 



GAACACAG CUGAUGAG GCCGUUAGGC CGAA ACUCUUAA 



UAAGAGUU C UGUGUUCA 



UGAACACA CUGAUGAG GCCGUUAGGC CGAA AACUCUUA 



UUCUGUGU U CAUCUUGA 



UCAAGAUG CUGAUGAG GCCGUUAGGC CGAA ACACAGAA 



UCUGUGUU C AUCUUGAU 



AUCAAGAU CUGAUGAG GCCGUUAGGC CGAA AACACAGA 



GUGUUCAU C UUGAUUCU 



AGAAUCAA CUGAUGAG GCCGUUAGGC CGAA AUGAACAC 



GUUCAUCU U GAUUCUUC 



GAAGAAUC CUGAUGAG GCCGUUAGGC CGAA AGAUGAAC 



387 
388 
390 
391 
396 
397 
399 
415 
418 
419 
423 
426 
427 
446 
456 



AUCUUGAU U CUUCACCU 



AGGUGAAG CUGAUGAG GCCGUUAGGC CGAA AUCAAGAU 



UCUUGAUU C UUCACCUU 



AAGGUGAA CUGAUGAG GCCGUUAGGC CGAA AAUCAAGA 



UUGAUUCU U CACCUUCU 



AGAAGGUG CUGAUGAG GCCGUUAGGC CGAA AGAAUCAA 



UGAUUCUU C ACCUUCUA 



UAGAAGGU CUGAUGAG GCCGUUAGGC CGAA AAGAAUCA 



CUUCACCU U CUAGAAGG 



CCUUCUAG CUGAUGAG GCCGUUAGGC CGAA AGGUGAAG 



UUCACCUU C UAGAAGGG 



CCCUUCUA CUGAUGAG GCCGUUAGGC CGAA AAGGUGAA 



CACCUUCU A GAAGGGGC 



GCCCCUUC CUGAUGAG GCCGUUAGGC CGAA AGAAGGUG 



CCCUGAGU A AUUCACUC 



GAGUGAAU CUGAUGAG GCCGUUAGGC CGAA ACUCAGGG 



UGAGUAAU U CACUCAUU 



AAUGAGUG CUGAUGAG GCCGUUAGGC CGAA AUUACUCA 



GAGUAAUU C ACUCAUUC 



GAAUGAGU CUGAUGAG GCCGUUAGGC CGAA AAUUACUC 



AAUUCACU C AUUCAGCU 



AGCUGAAU CUGAUGAG GCCGUUAGGC CGAA AGUGAAUU 



UCACUCAU U CAGCUGAA 



UUCAGCUG CUGAUGAG GCCGUUAGGC CGAA AUGAGUGA 



CACUCAUU C AGCUGAAC 



GUUCAGCU CUGAUGAG GCCGUUAGGC CGAA AAUGAGUG 



CAAUGGCU A UGAAGGCA 



UGCCUUCA CUGAUGAG GCCGUUAGGC CGAA AGCCAUUG 



GAAGGCAU U GUCGUUGC 



GCAACGAC CUGAUGAG GCCGUUAGGC CGAA AUGCCUUC 



462 
468 
498 
501 
502 
510 
533 
535 
539 



GGCAUUGU C GUUGCAAU 



AUUGCAAC CUGAUGAG GCCGUUAGGC CGAA ACAAUGCC 



AUUGUCGU U GCAAUCGA 



UCGAUUGC CUGAUGAG GCCGUUAGGC CGAA ACGACAAU 



GUUGCAAU C GACCCCAA 



UUGGGGUC CUGAUGAG GCCGUUAGGC CGAA AUUGCAAC 



GAAACACU C AUUCAACA 



UGUUGAAU CUGAUGAG GCCGUUAGGC CGAA AGUGUUUC 



ACACUCAU U CAACAAAU 



AUUUGUUG CUGAUGAG GCCGUUAGGC CGAA AUGAGUGU 



CACUCAUU C AACAAAUA 



UAUUUGUU CUGAUGAG GCCGUUAGGC CGAA AAUGAGUG 



CAACAAAU A AAGGACAU 



AUGUCCUU CUGAUGAG GCCGUUAGGC CGAA AUUUGUUG 



CCAGGCAU C UCUGUAUC 



GAUACAGA CUGAUGAG GCCGUUAGGC CGAA AUGCCUGG 



AGGCAUCU C UGUAUCUG 



CAGAUACA CUGAUGAG GCCGUUAGGC CGAA AGAUGCCU 



AUCUCUGU A UCUGUUUG 



CAAACAGA CUGAUGAG GCCGUUAGGC CGAA ACAGAGAU 



647 
663 
664 
669 
677 
679 
682 
685 
691 
704 
747 
753 
757 
763 
764 
765 
768 
782 
783 
791 
805 
812 
813 
816 
829 
832 
834 



849 
857 
862 
875 
876 



CUCUGUAU C UGUUUGAA   UUCAAACA CUGAUGAG GCCGUUAGGC CGAA AUACAGAG
GUAUCUGU U UGAAGCUA   UAGCUUCA CUGAUGAG GCCGUUAGGC CGAA ACAGAUAC
UAUCUGUU U GAAGCUAC   GUAGCUUC CUGAUGAG GCCGUUAGGC CGAA AACAGAUA
UUGAAGCU A CAGGAAAG   CUUUCCUG CUGAUGAG GCCGUUAGGC CGAA AGCUUCAA
AAAGCGAU U UUAUUUCA   UGAAAUAA CUGAUGAG GCCGUUAGGC CGAA AUCGCUUU
AAGCGAUU U UAUUUCAA   UUGAAAUA CUGAUGAG GCCGUUAGGC CGAA AAUCGCUU
AGCGAUUU U AUUUCAAA   UUUGAAAU CUGAUGAG GCCGUUAGGC CGAA AAAUCGCU
GCGAUUUU A UUUCAAAA   UUUUGAAA CUGAUGAG GCCGUUAGGC CGAA AAAAUCGC
GAUUUUAU U UCAAAAAU   AUUUUUGA CUGAUGAG GCCGUUAGGC CGAA AUAAAAUC
AUUUUAUU U CAAAAAUG   CAUUUUUG CUGAUGAG GCCGUUAGGC CGAA AAUAAAAU
UUUUAUUU C AAAAAUGU   ACAUUUUU CUGAUGAG GCCGUUAGGC CGAA AAAUAAAA
AAAAAUGU U GCCAUUUU   AAAAUGGC CUGAUGAG GCCGUUAGGC CGAA ACAUUUUU
GUUGCCAU U UUGAUUCC   GGAAUCAA CUGAUGAG GCCGUUAGGC CGAA AUGGCAAC
UUGCCAUU U UGAUUCCU   AGGAAUCA CUGAUGAG GCCGUUAGGC CGAA AAUGGCAA
UGCCAUUU U GAUUCCUG   CAGGAAUC CUGAUGAG GCCGUUAGGC CGAA AAAUGGCA
AUUUUGAU U CCUGAAAC   GUUUCAGG CUGAUGAG GCCGUUAGGC CGAA AUCAAAAU
UUUUGAUU C CUGAAACA   UGUUUCAG CUGAUGAG GCCGUUAGGC CGAA AAUCAAAA   2314
GGCUGACU A UGUGAGAC   GUCUCACA CUGAUGAG GCCGUUAGGC CGAA AGUCAGCC   2315
CCAAAACU U GAGACCUA   UAGGUCUC CUGAUGAG GCCGUUAGGC CGAA AGUUUUGG
UGAGACCU A CAAAAAUG   CAUUUUUG CUGAUGAG GCCGUUAGGC CGAA AGGUCUCA
GCUGAUGU U CUGGUUGC   GCAACCAG CUGAUGAG GCCGUUAGGC CGAA ACAUCAGC
CUGAUGUU C UGGUUGCU   AGCAACCA CUGAUGAG GCCGUUAGGC CGAA AACAUCAG
GUUCUGGU U GCUGAGUC   GACUCAGC CUGAUGAG GCCGUUAGGC CGAA ACCAGAAC
UGCUGAGU C UACUCCUC   GAGGAGUA CUGAUGAG GCCGUUAGGC CGAA ACUCAGCA
CUGAGUCU A CUCCUCCA   UGGAGGAG CUGAUGAG GCCGUUAGGC CGAA AGACUCAG
AGUCUACU C CUCCAGGU   ACCUGGAG CUGAUGAG GCCGUUAGGC CGAA AGUAGACU
CUACUCCU C CAGGUAAU   AUUACCUG CUGAUGAG GCCGUUAGGC CGAA AGGAGUAG
CUCCAGGU A AUGAUGAA   UUCAUCAU CUGAUGAG GCCGUUAGGC CGAA ACCUGGAG
UGAACCCU A CACUGAGC   GCUCAGUG CUGAUGAG GCCGUUAGGC CGAA AGGGUUCA   2326
GAAAGGAU C CACCUCAC   GUGAGGUG CUGAUGAG GCCGUUAGGC CGAA AUCCUUUC   2327
AUCCACCU C ACUCCUGA   UCAGGAGU CUGAUGAG GCCGUUAGGC CGAA AGGUGGAU
ACCUCACU C CUGAUUUC   GAAAUCAG CUGAUGAG GCCGUUAGGC CGAA AGUGAGGU
CUCCUGAU U UCAUUGCA   UGCAAUGA CUGAUGAG GCCGUUAGGC CGAA AUCAGGAG
UCCUGAUU U CAUUGCAG   CUGCAAUG CUGAUGAG GCCGUUAGGC CGAA AAUCAGGA
CCUGAUUU C AUUGCAGG   CCUGCAAU CUGAUGAG GCCGUUAGGC CGAA AAAUCAGG
GAUUUCAU U GCAGGAAA   UUUCCUGC CUGAUGAG GCCGUUAGGC CGAA AUGAAAUC
AAAAAAGU U AGCUGAAU   AUUCAGCU CUGAUGAG GCCGUUAGGC CGAA ACUUUUUU
AAAAAGUU A GCUGAAUA   UAUUCAGC CUGAUGAG GCCGUUAGGC CGAA AACUUUUU
AGCUGAAU A UGGACCAC   GUGGUCCA CUGAUGAG GCCGUUAGGC CGAA AUUCAGCU
CACAAGGU A AGGCAUUU   AAAUGCCU CUGAUGAG GCCGUUAGGC CGAA ACCUUGUG
UAAGGCAU U UGUCCAUG   CAUGGACA CUGAUGAG GCCGUUAGGC CGAA AUGCCUUA
AAGGCAUU U GUCCAUGA   UCAUGGAC CUGAUGAG GCCGUUAGGC CGAA AAUGCCUU
GCAUUUGU C CAUGAGUG   CACUCAUG CUGAUGAG GCCGUUAGGC CGAA ACAAAUGC
AGUGGGCU C AUCUACGA   UCGUAGAU CUGAUGAG GCCGUUAGGC CGAA AGCCCACU
GGGCUCAU C UACGAUGG   CCAUCGUA CUGAUGAG GCCGUUAGGC CGAA AUGAGCCC
GCUCAUCU A CGAUGGGG   CCCCAUCG CUGAUGAG GCCGUUAGGC CGAA AGAUGAGC
UGGGGAGU A UUUGACGA   UCGUCAAA CUGAUGAG GCCGUUAGGC CGAA ACUCCCCA
GGGAGUAU U UGACGAGU   ACUCGUCA CUGAUGAG GCCGUUAGGC CGAA AUACUCCC
GGAGUAUU U GACGAGUA   UACUCGUC CUGAUGAG GCCGUUAGGC CGAA AAUACUCC
UGACGAGU A CAAUAAUG   CAUUAUUG CUGAUGAG GCCGUUAGGC CGAA ACUCGUCA
AGUACAAU A AUGAUGAG   CUCAUCAU CUGAUGAG GCCGUUAGGC CGAA AUUGUACU
UGAGAAAU U CUACUUAU   AUAAGUAG CUGAUGAG GCCGUUAGGC CGAA AUUUCUCA
GAGAAAUU C UACUUAUC   GAUAAGUA CUGAUGAG GCCGUUAGGC CGAA AAUUUCUC
GAAAUUCU A CUUAUCCA   UGGAUAAG CUGAUGAG GCCGUUAGGC CGAA AGAAUUUC
AUUCUACU U AUCCAAUG   CAUUGGAU CUGAUGAG GCCGUUAGGC CGAA AGUAGAAU
UUCUACUU A UCCAAUGG   CCAUUGGA CUGAUGAG GCCGUUAGGC CGAA AAGUAGAA

CUACUUAU C CAAUGGAA 



UUCCAUUG CUGAUGAG GCCGUUAGGC CGAA AUAAGUAG 



GGAAGAAU A CAAGCAGU 



ACUGCUUG CUGAUGAG GCCGUUAGGC CGAA AUUCUUCC 



CAAGCAGU A AGAUGUUC 



GAACAUCU CUGAUGAG GCCGUUAGGC CGAA ACUGCUUG 



UAAGAUGU U CAGCAGGU 



ACCUGCUG CUGAUGAG GCCGUUAGGC CGAA ACAUCUUA 



AAGAUGUU C AGCAGGUA 



UACCUGCU CUGAUGAG GCCGUUAGGC CGAA AACAUCUU 



CAGCAGGU A UUACUGGU 



ACCAGUAA CUGAUGAG GCCGUUAGGC CGAA ACCUGCUG 



GCAGGUAU U ACUGGUAC 



GUACCAGU CUGAUGAG GCCGUUAGGC CGAA AUACCUGC 



CAGGUAUU A CUGGUACA 



UGUACCAG CUGAUGAG GCCGUUAGGC CGAA AAUACCUG 



UUACUGGU A CAAAUGUA 



UACAUUUG CUGAUGAG GCCGUUAGGC CGAA ACCAGUAA 



ACAAAUGU A GUAAAGAA 



UUCUUUAC CUGAUGAG GCCGUUAGGC CGAA ACAUUUGU 



AAUGUAGU A AAGAAGUG 



CACUUCUU CUGAUGAG GCCGUUAGGC CGAA ACUACAUU 



AGAAGUGU C AGGGAGGC 



GCCUCCCU CUGAUGAG GCCGUUAGGC CGAA ACACUUCU 



GCAGCUGU U ACACCAAA 



UUUGGUGU CUGAUGAG GCCGUUAGGC CGAA ACAGCUGC 



CAGCUGUU A CACCAAAA 



UUUUGGUG CUGAUGAG GCCGUUAGGC CGAA AACAGCUG 



AUGCACAU U CAAUAAAG 



CUUUAUUG CUGAUGAG GCCGUUAGGC CGAA AUGUGCAU 



UGCACAUU C AAUAAAGU 



ACUUUAUU CUGAUGAG GCCGUUAGGC CGAA AAUGUGCA 



CAUUCAAU A AAGUUACA 



UGUAACUU CUGAUGAG GCCGUUAGGC CGAA AUUGAAUG 



AAUAAAGU U ACAGGACU 



AGUCCUGU CUGAUGAG GCCGUUAGGC CGAA ACUUUAUU 



AUAAAGUU A CAGGACUC 



GAGUCCUG CUGAUGAG GCCGUUAGGC CGAA AACUUUAU 



ACAGGACU C UAUGAAAA 



UUUUCAUA CUGAUGAG GCCGUUAGGC CGAA AGUCCUGU 



AGGACUCU A UGAAAAAG 



CUUUUUCA CUGAUGAG GCCGUUAGGC CGAA AGAGUCCU 



AUGUGAGU U UGUUCUCC 



GGAGAACA CUGAUGAG GCCGUUAGGC CGAA ACUCACAU 



UGUGAGUU U GUUCUCCA 



UGGAGAAC CUGAUGAG GCCGUUAGGC CGAA AACUCACA 



GAGUUUGU U CUCCAAUC 



GAUUGGAG CUGAUGAG GCCGUUAGGC CGAA ACAAACUC 



AGUUUGUU C UCCAAUCC 



GGAUUGGA CUGAUGAG GCCGUUAGGC CGAA AACAAACU 



UUUGUUCU C CAAUCCCG 



CGGGAUUG CUGAUGAG GCCGUUAGGC CGAA AGAACAAA 



UCUCCAAU C CCGCCAGA 



UCUGGCGG CUGAUGAG GCCGUUAGGC CGAA AUUGGAGA 



AGAAGGCU U CUAUAAUG 



CAUUAUAG CUGAUGAG GCCGUUAGGC CGAA AGCCUUCU 



GAAGGCUU C UAUAAUGU 



ACAUUAUA CUGAUGAG GCCGUUAGGC CGAA AAGCCUUC 



AGGCUUCU A UAAUGUUU 



AAACAUUA CUGAUGAG GCCGUUAGGC CGAA AGAAGCCU 



GCUUCUAU A AUGUUUGC 



GCAAACAU CUGAUGAG GCCGUUAGGC CGAA AUAGAAGC 



UAUAAUGU U UGCACAAC 



GUUGUGCA CUGAUGAG GCCGUUAGGC CGAA ACAUUAUA 



AUAAUGUU U GCACAACA 



UGUUGUGC CUGAUGAG GCCGUUAGGC CGAA AACAUUAU 



CAACAUGU U GAUUCUAU 



AUAGAAUC CUGAUGAG GCCGUUAGGC CGAA ACAUGUUG 



AUGUUGAU U CUAUAGUU 



AACUAUAG CUGAUGAG GCCGUUAGGC CGAA AUCAACAU 



UGUUGAUU C UAUAGUUG 



CAACUAUA CUGAUGAG GCCGUUAGGC CGAA AAUCAACA 



UUGAUUCU A UAGUUGAA 



UUCAACUA CUGAUGAG GCCGUUAGGC CGAA AGAAUCAA 



GAUUCUAU A GUUGAAUU 



AAUUCAAC CUGAUGAG GCCGUUAGGC CGAA AUAGAAUC 



UCUAUAGU U GAAUUCUG 



CAGAAUUC CUGAUGAG GCCGUUAGGC CGAA ACUAUAGA 



AGUUGAAU U CUGUACAG 



CUGUACAG CUGAUGAG GCCGUUAGGC CGAA AUUCAACU 



GUUGAAUU C UGUACAGA 



UCUGUACA CUGAUGAG GCCGUUAGGC CGAA AAUUCAAC 



AAUUCUGU A CAGAACAA 



UUGUUCUG CUGAUGAG GCCGUUAGGC CGAA ACAGAAUU 



AAGAAGCU C CAAACAAG 



CUUGUUUG CUGAUGAG GCCGUUAGGC CGAA AGCUUCUU 



AGCAAAAU C AAAAAUGC 



GCAUUUUU CUGAUGAG GCCGUUAGGC CGAA AUUUUGCU 



AAUGCAAU C UCCGAAGC 



GCUUCGGA CUGAUGAG GCCGUUAGGC CGAA AUUGCAUU 



UGCAAUCU C CGAAGCAC 



GUGCUUCG CUGAUGAG GCCGUUAGGC CGAA AGAUUGCA 



GAAGUGAU C CGUGAUUC 



GAAUCACG CUGAUGAG GCCGUUAGGC CGAA AUCACUUC 



UCCGUGAU U CUGAGGAC 



GUCCUCAG CUGAUGAG GCCGUUAGGC CGAA AUCACGGA 



CCGUGAUU C UGAGGACU 



AGUCCUCA CUGAUGAG GCCGUUAGGC CGAA AAUCACGG 



UGAGGACU U UAAGAAAA 



UUUUCUUA CUGAUGAG GCCGUUAGGC CGAA AGUCCUCA 



GAGGACUU U AAGAAAAC 



GUUUUCUU CUGAUGAG GCCGUUAGGC CGAA AAGUCCUC 



AGGACUUU A AGAAAACC 



GGUUUUCU CUGAUGAG GCCGUUAGGC CGAA AAAGUCCU 



AAACCACU C CUAUGACA 



UGUCAUAG CUGAUGAG GCCGUUAGGC CGAA AGUGGUUU 



CCACUCCU A UGACAACA 



UGUUGUCA CUGAUGAG GCCGUUAGGC CGAA AGGAGUGG 



CACCAAAU C CCACCUUC 



GAAGGUGG CUGAUGAG GCCGUUAGGC CGAA AUUUGGUG 



UCCCACCU U CUCAUUGC 



GCAAUGAG CUGAUGAG GCCGUUAGGC CGAA AGGUGGGA 



CCCACCUU C UCAUUGCU   AGCAAUGA CUGAUGAG GCCGUUAGGC CGAA AAGGUGGG
CACCUUCU C AUUGCUGC   GCAGCAAU CUGAUGAG GCCGUUAGGC CGAA AGAAGGUG
CUUCUCAU U GCUGCAGA   UCUGCAGC CUGAUGAG GCCGUUAGGC CGAA AUGAGAAG
CUGCAGAU U GGACAAAG   CUUUGUCC CUGAUGAG GCCGUUAGGC CGAA AUCUGCAG
CAAAGAAU U GUGUGUUU   AAACACAC CUGAUGAG GCCGUUAGGC CGAA AUUCUUUG
UUGUGUGU U UAGUCCUU   AAGGACUA CUGAUGAG GCCGUUAGGC CGAA ACACACAA
UGUGUGUU U AGUCCUUG   CAAGGACU CUGAUGAG GCCGUUAGGC CGAA AACACACA
GUGUGUUU A GUCCUUGA   UCAAGGAC CUGAUGAG GCCGUUAGGC CGAA AAACACAC
UGUUUAGU C CUUGACAA   UUGUCAAG CUGAUGAG GCCGUUAGGC CGAA ACUAAACA



UUAGUCCU U GACAAAUC 



GAUUUGUC CUGAUGAG GCCGUUAGGC CGAA AGGACUAA 



UGACAAAU C UGGAAGCA 



UGCUUCCA CUGAUGAG GCCGUUAGGC CGAA AUUUGUCA 



CGACUGGU A ACCGCCUC 



GAGGCGGU CUGAUGAG GCCGUUAGGC CGAA ACCAGUCG 



AACCGCCU C AAUCGACU 



AGUCGAUU CUGAUGAG GCCGUUAGGC CGAA AGGCGGUU 



GCCUCAAU C GACUGAAU 



AUUCAGUC CUGAUGAG GCCGUUAGGC CGAA AUUGAGGC 



GACUGAAU C AAGCAGGC 



GCCUGCUU CUGAUGAG GCCGUUAGGC CGAA AUUCAGUC 



GGCCAGCU U UUCCUGCU 



AGCAGGAA CUGAUGAG GCCGUUAGGC CGAA AGCUGGCC 



GCCAGCUU U UCCUGCUG 



CAGCAGGA CUGAUGAG GCCGUUAGGC CGAA AAGCUGGC 



CCAGCUUU U CCUGCUGC 



GCAGCAGG CUGAUGAG GCCGUUAGGC CGAA AAAGCUGG 



CAGCUUUU C CUGCUGCA 



UGCAGCAG CUGAUGAG GCCGUUAGGC CGAA AAAAGCUG 



CAGACAGU U GAGCUGGG 



CCCAGCUC CUGAUGAG GCCGUUAGGC CGAA ACUGUCUG 



GCUGGGGU C CUGGGUUG 



CAACCCAG CUGAUGAG GCCGUUAGGC CGAA ACCCCAGC 



UCCUGGGU U GGGAUGGU 



ACCAUCCC CUGAUGAG GCCGUUAGGC CGAA ACCCAGGA 



GGUGACAU U UGACAGUG 



CACUGUCA CUGAUGAG GCCGUUAGGC CGAA AUGUCACC 



GUGACAUU U GACAGUGC 



GCACUGUC CUGAUGAG GCCGUUAGGC CGAA AAUGUCAC 



GCCCAUGU A CAAAGUGA 



UCACUUUG CUGAUGAG GCCGUUAGGC CGAA ACAUGGGC 



AGUGAACU C AUACAGAU 



AUCUGUAU CUGAUGAG GCCGUUAGGC CGAA AGUUCACU 



GAACUCAU A CAGAUAAA 



UUUAUCUG CUGAUGAG GCCGUUAGGC CGAA AUGAGUUC 



AUACAGAU A AACAGUGG 



CCACUGUU CUGAUGAG GCCGUUAGGC CGAA AUCUGUAU 



GACACACU C GCCAAAAG 



CUUUUGGC CUGAUGAG GCCGUUAGGC CGAA AGUGUGUC 



CAAAAGAU U ACCUGCAG 



CUGCAGGU CUGAUGAG GCCGUUAGGC CGAA AUCUUUUG 



AAAAGAUU A CCUGCAGC 



GCUGCAGG CUGAUGAG GCCGUUAGGC CGAA AAUCUUUU 



CAGCAGCU U CAGGAGGG 



CCCUCCUG CUGAUGAG GCCGUUAGGC CGAA AGCUGCUG 



AGCAGCUU C AGGAGGGA 



UCCCUCCU CUGAUGAG GCCGUUAGGC CGAA AAGCUGCU 



AGGGACGU C CAUCUGCA 



UGCAGAUG CUGAUGAG GCCGUUAGGC CGAA ACGUCCCU 



ACGUCCAU C UGCAGCGG 



CCGCUGCA CUGAUGAG GCCGUUAGGC CGAA AUGGACGU 



AGCGGGCU U CGAUCGGC 



GCCGAUCG CUGAUGAG GCCGUUAGGC CGAA AGCCCGCU 



GCGGGCUU C GAUCGGCA 



UGCCGAUC CUGAUGAG GCCGUUAGGC CGAA AAGCCCGC 



GCUUCGAU C GGCAUUUA 



UAAAUGCC CUGAUGAG GCCGUUAGGC CGAA AUCGAAGC 



AUCGGCAU U UACUGUGA 



UCACAGUA CUGAUGAG GCCGUUAGGC CGAA AUGCCGAU 



UCGGCAUU U ACUGUGAU 



AUCACAGU CUGAUGAG GCCGUUAGGC CGAA AAUGCCGA 



CGGCAUUU A CUGUGAUU 



AAUCACAG CUGAUGAG GCCGUUAGGC CGAA AAAUGCCG 



ACUGUGAU U AGGAAGAA 



UUCUUCCU CUGAUGAG GCCGUUAGGC CGAA AUCACAGU 



CUGUGAUU A GGAAGAAA 



UUUCUUCC CUGAUGAG GCCGUUAGGC CGAA AAUCACAG 



1550 
1552 



GAAGAAAU A UCCAACUG 



CAGUUGGA CUGAUGAG GCCGUUAGGC CGAA AUUUCUUC 



AGAAAUAU C CAACUGAU 



AUCAGUUG CUGAUGAG GCCGUUAGGC CGAA AUAUUUCU 



1565 
1572 
1603 



UGAUGGAU C UGAAAUUG 



CAAUUUCA CUGAUGAG GCCGUUAGGC CGAA AUCCAUCA 



UCUGAAAU U GUGCUGCU 



AGCAGCAC CUGAUGAG GCCGUUAGGC CGAA AUUUCAGA 



ACAACACU A UAAGUGGG 



CCCACUUA CUGAUGAG GCCGUUAGGC CGAA AGUGUUGU 



AACACUAU A AGUGGGUG 



UGGGUGCU U UAACGAGG 



CACCCACU CUGAUGAG GCCGUUAGGC CGAA AUAGUGUU 



GGGUGCUU U AACGAGGU 



CCUCGUUA CUGAUGAG GCCGUUAGGC CGAA AGCACCCA 



GGUGCUUU A ACGAGGUC 



ACCUCGUU CUGAUGAG GCCGUUAGGC CGAA i 
GACCUCGU CUGAUGAG GCCGUUAGGC CGAA i 



AACGAGGU C AAACAAAG 



GGUGCCAU C AUCCACAC 



CUUUGUUU CUGAUGAG GCCGUUAGGC CGAA ACCUCGUU 



GCCAUCAU C CACACAGU 



GUGUGGAU CUGAUGAG GCCGUUAGGC CGAA AUGGCACC 



CACACAGU C GCUUUGGG 



ACUGUGUG CUGAUGAG GCCGUUAGGC CGAA AUGAUGGC 



CCCAAAGC CUGAUGAG GCCGUUAGGC CGAA ACUGUGUG 






1660 


CAGUCGCU 




r 

JGGGGCCC 




GGGCCCCA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGCGACUG 


2466 


1661 


AGUCGCUU 


— 


o Li Vl- ^ l. u 




AGGGCCCC 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAGCGACU 


2467 


1670 


p p p p p p pt t 




JGCAGCUC 




GAGCUGCA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGGGCCCC 


2468 


1678 


CUGCAGCU 




AAGAACUA 




JAGUUeuU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGCUGCAG 


2469 


1686 


CAAGAACU 


— 






AGCUCCUC 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGUUCUUG 


2470 


1697 


GGAGCUGU 


— 


CAAAAUGA 




JCAUUUUG 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


ACAGCUCC 


2471 




CAGGAGGU 


— 


U ALAuH.LA 




UGUCUGUA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


ACCUCCUG 


2472 




AGGAGGUU 


— 






AUGUCUGU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAC CUC CU 


2473 




pp apptttttt 


— 


CAGACAUA 




UAUGUCUG 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAACCUCC 


2474 


1724 


ACAGACAU 




A 


UGCUUCAG 




CUGAAGCA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AUGUCUGU 


2475 


1729 


C AU AUG CU 




CAGAUCAA 


28? 


UUGAUCUG 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGCAUAUG 


2476 


1730 


AUAUGCUU 


— 


AGAU C AAG 


2g8 


CUUGAUCU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAGCAUAU 


2477 


1735 


CUUCAGAU 


— 


AAGUUCAG 




289 


CUGAACUU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AUCUGAAG 


2478 


1740 


GAUCAAGU 


— 


C AG AAC AA 




UUGUUCUG 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


ACUUGAUC 


2479 




AUCAAGUU 




AGAACAAU 




AUUGUUCU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AACUUGAU 


2480 


1755 


AAUGGCCU 


— 


AUUG AUG C 




292 


G C AUC AAU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGGCCAUU 


2481 


1758 


GGCCUCAU 


— 


G AUG CUUU 




AAAG C AUC 


CUGAUGAG 


GCCGUUAGGC 




AUGAGGCC 


2482 


1765 


UUGAUGCU 


— 


t tt ic* r* r* nr r* 

U Uu-oajvtV-.^ 




GGCCCCAA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGCAUCAA 


2483 




UG AUG CUU 


— 


UGGGGCCC 


295 


GGGCCCCA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAGCAUCA 


2484 


1767 


GAUGCUUU 


— 


IjLr^Ljt— u 




AGGGCCCC 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAAGCAUC 


2485 


1776 


GGGGCCCU 


— 


UCAUCAGG 




CCUGAUGA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGGGCCCC 


2486 


1777 


GGGCCCUU 


— 


U LHuuH 




UC CUG AUG 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAGGGCCC 


2487 


1778 


GGCCCUUU 


— 






UUCCUGAU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAAGGGCC 


2488 


1781 


CCUUUCAU 


— 


AGGAAAUG 




CAUUUCCU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AUGAAAGG 


2489 


1797 


GGAGCUGU 


— 


UCUCAGCG 


— oi~ 


CGCUGAGA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


ACAGCUCC 


2490 




AGCUGUCU 


— 


t tp ft p p p p t t 




AGCGCUGA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGACAGCU 


2491 




CUGUCUCU 


— 


U Lj I— U v_. 




GGAGCGCU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGAGACAG 


2492 


1808 


UCAGCGCU 


— 


ptittppapp 


3Q4 


GCUGGAUG 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGCGCUGA 


2493 


1812 


PPPTTPPZ1TT 




CAGCUUGA 




UCAAGCUG 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AUGGAGCG 


2494 


1818 


AUCCAGCU 





GAGAGUAA 




UUACUCUC 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AG CUGG AU 


2495 


1825 


UUGAGAGU 


— 


AGGGAUUA 




UAAUCCCU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


ACUCUCAA 


2496 


1832 


UAAGGGAU 


— 


AAC C CUC C 




308 


GGAGG GUU 


CUGAUGAG 


GCCGUUAGGC 




AUCCCUUA 


2497 


1833 


AAGGGAUU 


— 


At— \- \- U L. 




UGGAGGGU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAUCCCUU 


2498 


1839 


UUAACCCU 


— 


hah! a a pari 




CUGUUCUG 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGGGUUAA 


2499 




AGAGUGAU 


— 


GUGGACAG 




CUGUCCAC 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AUCACUGU 


2500 




AGGACACU 




UGUUUCUU 




AAGAAACA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGUGUCCU 


2501 




GGACACUU 






3 


UAAGAAAC 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAGUGUCC 


2502 


1904 


CACUUUGU 


— 


UCUUAUCA 




UGAUAAGA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


ACAAAGUG 


2503 


1905 


A PTTTTTTPTTTT 




CUUAUCAC 




GUGAUAAG 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AACAAAGU 


2504 




CUUUGUUU 


— 


UUAU C AC C 




GGUGAUAA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAACAAAG 


2505 


1908 


UUGUUUCU 





AUC AC CUG 




CAGGUGAU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGAAACAA 


2506 


1909 


UGUUUCUU 


— 


UC AC CUGG 




CCAGGUGA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAGAAACA 


2507 


1911 


UUUCUUAU 




C 


AC CUGGAC 




319 


GUC C AGGU 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AUAAGAAA 


2508 


1930 


CGCAGCCU 


° 


CCCAAAUC 




GAUUUGGG 


CUGAUGAG 


GCCGUUAGGC 




AGGCUGCG 


2509 


1938 


CCCCAAAU 


C 


CUUCUCUG 






CUGAUGAG 


GCCGUUAGGC 




AUUUGGGG 


2510 


1941 


CAAAUCCU 


U 


CUCUGGGA 


322 


U C C C AG AG 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGGAUUUG 


2511 




AAAUCCUU 




UCUGGGAU 




AUCC C AGA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAGGAUUU 


2512 


1944 


AUCCUUCU 


— 


UGGGAUCC 




324 




CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGAAGGAU 


2513 




UCUGGGAU 


— 


CCAGUGGA 




UCCACUGG 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AUCCCAGA 


2514 




t\ ppttppptt 




UGUAGUGG 




CCACUACA 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AGCCACCU 


2515 




GGUGGCUU 


u 


GUAGUGGA 


327 


UCCACUAC 


CUGAUGAG 


GCCGUUAGGC 


CGAA 


AAGCCACC 


2516 


1980   GGCUUUGU A GUGGACAA   328   UUGUCCAC CUGAUGAG GCCGUUAGGC CGAA ACAAAGCC   2517
2006   AAUGGCCU A CCUCCAAA   329   UUUGGAGG CUGAUGAG GCCGUUAGGC CGAA AGGCCAUU   2518
2010   GCCUACCU C CAAAUCCC   330   GGGAUUUG CUGAUGAG GCCGUUAGGC CGAA AGGUAGGC   2519
2016   CUCCAAAU C CCAGGCAU   331   AUGCCUGG CUGAUGAG GCCGUUAGGC CGAA AUUUGGAG   2520
2025   CCAGGCAU U GCUAAGGU   332   ACCUUAGC CUGAUGAG GCCGUUAGGC CGAA AUGCCUGG   2521



GCAUUGCU A AGGUUGGC 



GCCAACCU CUGAUGAG GCCGUUAGGC CGAA AGCAAUGC 



GCUAAGGU U GGCACUUG 



CAAGUGCC CUGAUGAG GCCGUUAGGC CGAA ACCUUAGC 



UUGGCACU U GGAAAUAC 



GUAUUUCC CUGAUGAG GCCGUUAGGC CGAA AGUGCCAA 



UUGGAAAU A CAGUCUGC 



GCAGACUG CUGAUGAG GCCGUUAGGC CGAA AUUUCCAA 



AAUACAGU C UGCAAGCA 



UGCUUGCA CUGAUGAG GCCGUUAGGC CGAA ACUGUAUU 



AGCAAGCU C ACAAACCU 



AGGUUUGU CUGAUGAG GCCGUUAGGC CGAA AGCUUGCU 



ACAAACCU U GACCCUGA 



UCAGGGUC CUGAUGAG GCCGUUAGGC CGAA AGGUUUGU 



CUGACUGU C ACGUCCCG 



CGGGACGU CUGAUGAG GCCGUUAGGC CGAA ACAGUCAG 



UGUCACGU C CCGUGCGU 



ACGCACGG CUGAUGAG GCCGUUAGGC CGAA ACGUGACA 



CCGUGCGU C CAAUGCUA 



UAGCAUUG CUGAUGAG GCCGUUAGGC CGAA ACGCACGG 



CCAAUGCU A CCCUGCCU 



AGGCAGGG CUGAUGAG GCCGUUAGGC CGAA AGCAUUGG 



CCCUGCCU C CAAUUACA 



UGUAAUUG CUGAUGAG GCCGUUAGGC CGAA AGGCAGGG 



CCUCCAAU U ACAGUGAC 



GUCACUGU CUGAUGAG GCCGUUAGGC CGAA AUUGGAGG 



CUCCAAUU A CAGUGACU   AGUCACUG CUGAUGAG GCCGUUAGGC CGAA AAUUGGAG
CAGUGACU U CCAAAACG   CGUUUUGG CUGAUGAG GCCGUUAGGC CGAA AGUCACUG
AGUGACUU C CAAAACGA   UCGUUUUG CUGAUGAG GCCGUUAGGC CGAA AAGUCACU   2537
CAGCAAAU U CCCCAGCC   GGCUGGGG CUGAUGAG GCCGUUAGGC CGAA AUUUGCUG   2538
AGCAAAUU C CCCAGCCC   GGGCUGGG CUGAUGAG GCCGUUAGGC CGAA AAUUUGCU
CCAGCCCU C UGGUAGUU   AACUACCA CUGAUGAG GCCGUUAGGC CGAA AGGGCUGG
CCUCUGGU A GUUUAUGC   GCAUAAAC CUGAUGAG GCCGUUAGGC CGAA ACCAGAGG
CUGGUAGU U UAUGCAAA   UUUGCAUA CUGAUGAG GCCGUUAGGC CGAA ACUACCAG
UGGUAGUU U AUGCAAAU   AUUUGCAU CUGAUGAG GCCGUUAGGC CGAA AACUACCA
GGUAGUUU A UGCAAAUA   UAUUUGCA CUGAUGAG GCCGUUAGGC CGAA AAACUACC
AUGCAAAU A UUCGCCAA   UUGGCGAA CUGAUGAG GCCGUUAGGC CGAA AUUUGCAU
GCAAAUAU U CGCCAAGG   CCUUGGCG CUGAUGAG GCCGUUAGGC CGAA AUAUUUGC
CAAAUAUU C GCCAAGGA   UCCUUGGC CUGAUGAG GCCGUUAGGC CGAA AAUAUUUG
AGGAGCCU C CCCAAUUC   GAAUUGGG CUGAUGAG GCCGUUAGGC CGAA AGGCUCCU
UCCCCAAU U CUCAGGGC   GCCCUGAG CUGAUGAG GCCGUUAGGC CGAA AUUGGGGA
CCCCAAUU C UCAGGGCC   GGCCCUGA CUGAUGAG GCCGUUAGGC CGAA AAUUGGGG
CCAAUUCU C AGGGCCAG   CUGGCCCU CUGAUGAG GCCGUUAGGC CGAA AGAAUUGG
GCCAGUGU C ACAGCCCU   AGGGCUGU CUGAUGAG GCCGUUAGGC CGAA ACACUGGC
GCCCUGAU U GAAUCAGU   ACUGAUUC CUGAUGAG GCCGUUAGGC CGAA AUCAGGGC
GAUUGAAU C AGUGAAUG   CAUUCACU CUGAUGAG GCCGUUAGGC CGAA AUUCAAUC
AAAACAGU U ACCUUGGA   UCCAAGGU CUGAUGAG GCCGUUAGGC CGAA ACUGUUUU
AAACAGUU A CCUUGGAA   UUCCAAGG CUGAUGAG GCCGUUAGGC CGAA AACUGUUU
AGUUACCU U GGAACUAC   GUAGUUCC CUGAUGAG GCCGUUAGGC CGAA AGGUAACU
UUGGAACU A CUGGAUAA   UUAUCCAG CUGAUGAG GCCGUUAGGC CGAA AGUUCCAA
UACUGGAU A AUGGAGCA   UGCUCCAU CUGAUGAG GCCGUUAGGC CGAA AUCCAGUA
CUGAUGCU A CUAAGGAU   AUCCUUAG CUGAUGAG GCCGUUAGGC CGAA AGCAUCAG
AUGCUACU A AGGAUGAC   GUCAUCCU CUGAUGAG GCCGUUAGGC CGAA AGUAGCAU
GACGGUGU C UACUCAAG   CUUGAGUA CUGAUGAG GCCGUUAGGC CGAA ACACCGUC
CGGUGUCU A CUCAAGGU   ACCUUGAG CUGAUGAG GCCGUUAGGC CGAA AGACACCG
UGUCUACU C AAGGUAUU   AAUACCUU CUGAUGAG GCCGUUAGGC CGAA AGUAGACA
CUCAAGGU A UUUCACAA   UUGUGAAA CUGAUGAG GCCGUUAGGC CGAA ACCUUGAG
CAAGGUAU U UCACAACU   AGUUGUGA CUGAUGAG GCCGUUAGGC CGAA AUACCUUG
AAGGUAUU U CACAACUU   AAGUUGUG CUGAUGAG GCCGUUAGGC CGAA AAUACCUU
AGGUAUUU C ACAACUUA   UAAGUUGU CUGAUGAG GCCGUUAGGC CGAA AAAUACCU
UCACAACU U AUGACACG   CGUGUCAU CUGAUGAG GCCGUUAGGC CGAA AGUUGUGA
CACAACUU A UGACACGA   UCGUGUCA CUGAUGAG GCCGUUAGGC CGAA AAGUUGUG
CGAAUGGU A GAUACAGU   ACUGUAUC CUGAUGAG GCCGUUAGGC CGAA ACCAUUCG
UGGUAGAU A CAGUGUAA   UUACACUG CUGAUGAG GCCGUUAGGC CGAA AUCUACCA
UACAGUGU A AAAGUGCG   CGCACUUU CUGAUGAG GCCGUUAGGC CGAA ACACUGUA
UGCGGGCU C UGGGAGGA   UCCUCCCA CUGAUGAG GCCGUUAGGC CGAA AGCCCGCA
GGAGGAGU U AACGCAGC   GCUGCGUU CUGAUGAG GCCGUUAGGC CGAA ACUCCUCC
GAGGAGUU A ACGCAGCC   GGCUGCGU CUGAUGAG GCCGUUAGGC CGAA AACUCCUC
AGAGUGAU A CCCCAGCA   UGCUGGGG CUGAUGAG GCCGUUAGGC CGAA AUCACUCU



AGCACUGU A CAUACCUG 



CAGGUAUG CUGAUGAG GCCGUUAGGC CGAA ACAGUGCU 



CUGUACAU A CCUGGCUG 



CAGCCAGG CUGAUGAG GCCGUUAGGC CGAA AUGUACAG 



GGCUGGAU U GAGAAUGA 



UCAUUCUC CUGAUGAG GCCGUUAGGC CGAA AUCCAGCC 



GAUGAAAU A CAAUGGAA 



UUCCAUUG CUGAUGAG GCCGUUAGGC CGAA AUUUCAUC 



AAUGGAAU C CACCAAGA 



UCUUGGUG CUGAUGAG GCCGUUAGGC CGAA AUUCCAUU 



CCUGAAAU U AAUAAGGA 



UCCUUAUU CUGAUGAG GCCGUUAGGC CGAA AUUUCAGG 



CUGAAAUU A AUAAGGAU 



AUCCUUAU CUGAUGAG GCCGUUAGGC CGAA AAUUUCAG 



AAAUUAAU A AGGAUGAU 



AUCAUCCU CUGAUGAG GCCGUUAGGC CGAA AUUAAUUU 



GAUGAUGU U CAACACAA 



UUGUGUUG CUGAUGAG GCCGUUAGGC CGAA ACAUCAUC 



AUGAUGUU C AACACAAG 



CUUGUGUU CUGAUGAG GCCGUUAGGC CGAA AACAUCAU 



AAGUGUGU U UCAGCAGA 



UCUGCUGA CUGAUGAG GCCGUUAGGC CGAA ACACACUU 



AGUGUGUU U CAGCAGAA 



UUCUGCUG CUGAUGAG GCCGUUAGGC CGAA AACACACU 



GUGUGUUU C AGCAGAAC 



GUUCUGCU CUGAUGAG GCCGUUAGGC CGAA AAACACAC 



CAGAACAU C CUCGGGAG 



CUCCCGAG CUGAUGAG GCCGUUAGGC CGAA AUGUUCUG 



AACAUCCU C GGGAGGCU 



AGCCUCCC CUGAUGAG GCCGUUAGGC CGAA AGGAUGUU 



GGGAGGCU C AUUUGUGG 



CCACAAAU CUGAUGAG GCCGUUAGGC CGAA AGCCUCCC 



AGGCUCAU U UGUGGCUU 



AAGCCACA CUGAUGAG GCCGUUAGGC CGAA AUGAGCCU 



GGCUCAUU U GUGGCUUC 



GAAGCCAC CUGAUGAG GCCGUUAGGC CGAA AAUGAGCC 



UUGUGGCU U CUGAUGUC 



GACAUCAG CUGAUGAG GCCGUUAGGC CGAA AGCCACAA 



UGUGGCUU C UGAUGUCC 



GGACAUCA CUGAUGAG GCCGUUAGGC CGAA AAGCCACA 



UCUGAUGU C CCAAAUGC 



GCAUUUGG CUGAUGAG GCCGUUAGGC CGAA ACAUCAGA 



CAAAUGCU C CCAUACCU 



AGGUAUGG CUGAUGAG GCCGUUAGGC CGAA AGCAUUUG 



GCUCCCAU A CCUGAUCU 



AGAUCAGG CUGAUGAG GCCGUUAGGC CGAA AUGGGAGC 



UACCUGAU C UCUUCCCA 



UGGGAAGA CUGAUGAG GCCGUUAGGC CGAA AUCAGGUA 



CCUGAUCU C UUCCCACC 



GGUGGGAA CUGAUGAG GCCGUUAGGC CGAA AGAUCAGG 



UGAUCUCU U CCCACCUG 



CAGGUGGG CUGAUGAG GCCGUUAGGC CGAA AGAGAUCA 



GAUCUCUU C CCACCUGG 



CCAGGUGG CUGAUGAG GCCGUUAGGC CGAA AAGAGAUC 



GGCCAAAU C ACCGACCU 



AGGUCGGU CUGAUGAG GCCGUUAGGC CGAA AUUUGGCC 



GCGGAAAU U CACGGGGG 



CCCCCGUG CUGAUGAG GCCGUUAGGC CGAA AUUUCCGC 



CGGAAAUU C ACGGGGGC 



GCCCCCGU CUGAUGAG GCCGUUAGGC CGAA AAUUUCCG 



GGGGCAGU C UCAUUAAU 



AUUAAUGA CUGAUGAG GCCGUUAGGC CGAA ACUGCCCC 



GGCAGUCU C AUUAAUCU 



AGAUUAAU CUGAUGAG GCCGUUAGGC CGAA AGACUGCC 



AGUCUCAU U AAUCUGAC 



GUCAGAUU CUGAUGAG GCCGUUAGGC CGAA AUGAGACU 



GUCUCAUU A AUCUGACU 



AGUCAGAU CUGAUGAG GCCGUUAGGC CGAA AAUGAGAC 



UCAUUAAU C UGACUUGG 



CCAAGUCA CUGAUGAG GCCGUUAGGC CGAA AUUAAUGA 



AUCUGACU U GGACAGCU 



AGCUGUCC CUGAUGAG GCCGUUAGGC CGAA AGUCAGAU 



GGACAGCU C CUGGGGAU 



AUCCCCAG CUGAUGAG GCCGUUAGGC CGAA AGCUGUCC 



GGGAUGAU U AUGACCAU 



AUGGUCAU CUGAUGAG GCCGUUAGGC CGAA AUCAUCCC 



GGAUGAUU A UGACCAUG 



CAUGGUCA CUGAUGAG GCCGUUAGGC CGAA AAUCAUCC 



GAACAGCU C ACAAGUAU 



AUACUUGU CUGAUGAG GCCGUUAGGC CGAA AGCUGUUC 



UCACAAGU A UAUCAUUC 



GAAUGAUA CUGAUGAG GCCGUUAGGC CGAA ACUUGUGA 



ACAAGUAU A UCAUUCGA 



UCGAAUGA CUGAUGAG GCCGUUAGGC CGAA AUACUUGU 



AAGUAUAU C AUUCGAAU 



AUUCGAAU CUGAUGAG GCCGUUAGGC CGAA AUAUACUU 



UAUAUCAU U CGAAUAAG 



CUUAUUCG CUGAUGAG GCCGUUAGGC CGAA AUGAUAUA 



AUAUCAUU C GAAUAAGU 



ACUUAUUC CUGAUGAG GCCGUUAGGC CGAA AAUGAUAU 



AUUCGAAU A AGUACAAG 



CUUGUACU CUGAUGAG GCCGUUAGGC CGAA AUUCGAAU 



GAAUAAGU A CAAGUAUU 



AAUACUUG CUGAUGAG GCCGUUAGGC CGAA ACUUAUUC 



GUACAAGU A UUCUUGAU 



AUCAAGAA CUGAUGAG GCCGUUAGGC CGAA ACUUGUAC 



ACAAGUAU U CUUGAUCU 



AGAUCAAG CUGAUGAG GCCGUUAGGC CGAA AUACUUGU 



CAAGUAUU C UUGAUCUC 



GAGAUCAA CUGAUGAG GCCGUUAGGC CGAA AAUACUUG 



AGUAUUCU U GAUCUCAG 



CUGAGAUC CUGAUGAG GCCGUUAGGC CGAA AGAAUACU 



UUCUUGAU C UCAGAGAC 



GUCUCUGA CUGAUGAG GCCGUUAGGC CGAA AUCAAGAA 



CUUGAUCU C AGAGACAA 



UUGUCUCU CUGAUGAG GCCGUUAGGC CGAA AGAUCAAG 



AGACAAGU U CAAUGAAU 



AUUCAUUG CUGAUGAG GCCGUUAGGC CGAA ACUUGUCU 



GACAAGUU C AAUGAAUC 



GAUUCAUU CUGAUGAG GCCGUUAGGC CGAA AACUUGUC 



CAAUGAAU C UCUUCAAG 



CUUGAAGA CUGAUGAG GCCGUUAGGC CGAA AUUCAUUG 



AUGAAUCU C UUCAAGUG
GAAUCUCU U CAAGUGAA
AAUCUCUU C AAGUGAAU
AAGUGAAU A CUACUGCU
UGAAUACU A CUGCUCUC
CUACUGCU C UCAUCCCA
ACUGCUCU C AUCCCAAA
GCUCUCAU C CCAAAGGA
AGCCAACU C UGAGGAAG
GAGGAAGU C UUUUUGUU
GGAAGUCU U UUUGUUUA
GAAGUCUU U UUGUUUAA
AAGUCUUU U UGUUUAAA
AGUCUUUU U GUUUAAAC
CUUUUUGU U UAAACCAG
UUUUUGUU U AAACCAGA
UUUUGUUU A AACCAGAA
GAAAACAU U ACUUUUGA
AAAACAUU A CUUUUGAA
ACAUUACU U UUGAAAAU
CAUUACUU U UGAAAAUG
AUUACUUU U GAAAAUGG
GCACAGAU C UUUUCAUU
ACAGAUCU U UUCAUUGC
CAGAUCUU U UCAUUGCU
AGAUCUUU U CAUUGCUA
GAUCUUUU C AUUGCUAU
CUUUUCAU U GCUAUUCA
UCAUUGCU A UUCAGGCU
AUUGCUAU U CAGGCUGU
UUGCUAUU C AGGCUGUU
CAGGCUGU U GAUAAGGU
CUGUUGAU A AGGUCGAU
GAUAAGGU C GAUCUGAA
AGGUCGAU C UGAAAUCA
UCUGAAAU C AGAAAUAU
UCAGAAAU A UCCAACAU
AGAAAUAU C CAACAUUG
UCCAACAU U GCACGAGU
GCACGAGU A UCUUUGUU
ACGAGUAU C UUUGUUUA
GAGUAUCU U UGUUUAUU
AGUAUCUU U GUUUAUUC
AUCUUUGU U UAUUCCUC
UCUUUGUU U AUUCCUCC
CUUUGUUU A UUCCUCCA
UUGUUUAU U CCUCCACA
UGUUUAUU C CUCCACAG
UUAUUCCU C CACAGACU
CACAGACU C CGCCAGAG
AGACACCU A GUCCUGAU
CACCUAGU C CUGAUGAA
UGAAACGU C UGCUCCUU
CGUCUGCU C CUUGUCCU
CUGCUCCU U GUCCUAAU
CUCCUUGU C CUAAUAUU



CACUUGAA CUGAUGAG GCCGUUAGGC CGAA AGAUUCAU 

UUCACUUG CUGAUGAG GCCGUUAGGC CGAA AGAGAUUC 

AUUCACUU CUGAUGAG GCCGUUAGGC CGAA AAGAGAUU 

AGCAGUAG CUGAUGAG GCCGUUAGGC CGAA AUUCACUU 

GAGAGCAG CUGAUGAG GCCGUUAGGC CGAA AGUAUUCA 

UGGGAUGA CUGAUGAG GCCGUUAGGC CGAA AGCAGUAG 

UUUGGGAU CUGAUGAG GCCGUUAGGC CGAA AGAGCAGU 

UCCUUUGG CUGAUGAG GCCGUUAGGC CGAA AUGAGAGC 

CUUCCUCA CUGAUGAG GCCGUUAGGC CGAA AGUUGGCU 

AACAAAAA CUGAUGAG GCCGUUAGGC CGAA ACUUCCUC 

UAAACAAA CUGAUGAG GCCGUUAGGC CGAA AGACUUCC 

UUAAACAA CUGAUGAG GCCGUUAGGC CGAA AAGACUUC 

UUUAAACA CUGAUGAG GCCGUUAGGC CGAA AAAGACUU 

GUUUAAAC CUGAUGAG GCCGUUAGGC CGAA AAAAGACU 

CUGGUUUA CUGAUGAG GCCGUUAGGC CGAA ACAAAAAG 

UCUGGUUU CUGAUGAG GCCGUUAGGC CGAA AACAAAAA

UUCUGGUU CUGAUGAG GCCGUUAGGC CGAA AAACAAAA 

UCAAAAGU CUGAUGAG GCCGUUAGGC CGAA AUGUUUUC 

UUCAAAAG CUGAUGAG GCCGUUAGGC CGAA AAUGUUUU 

AUUUUCAA CUGAUGAG GCCGUUAGGC CGAA AGUAAUGU 

CAUUUUCA CUGAUGAG GCCGUUAGGC CGAA AAGUAAUG 

CCAUUUUC CUGAUGAG GCCGUUAGGC CGAA AAAGUAAU 

AAUGAAAA CUGAUGAG GCCGUUAGGC CGAA AUCUGUGC 

GCAAUGAA CUGAUGAG GCCGUUAGGC CGAA AGAUCUGU 

AGCAAUGA CUGAUGAG GCCGUUAGGC CGAA AAGAUCUG 

UAGCAAUG CUGAUGAG GCCGUUAGGC CGAA AAAGAUCU 

AUAGCAAU CUGAUGAG GCCGUUAGGC CGAA AAAAGAUC 

UGAAUAGC CUGAUGAG GCCGUUAGGC CGAA AUGAAAAG 

AGCCUGAA CUGAUGAG GCCGUUAGGC CGAA AGCAAUGA 

ACAGCCUG CUGAUGAG GCCGUUAGGC CGAA AUAGCAAU 

AACAGCCU CUGAUGAG GCCGUUAGGC CGAA AAUAGCAA 

ACCUUAUC CUGAUGAG GCCGUUAGGC CGAA ACAGCCUG 

AUCGACCU CUGAUGAG GCCGUUAGGC CGAA AUCAACAG 

UUCAGAUC CUGAUGAG GCCGUUAGGC CGAA ACCUUAUC 

UGAUUUCA CUGAUGAG GCCGUUAGGC CGAA AUCGACCU 

AUAUUUCU CUGAUGAG GCCGUUAGGC CGAA AUUUCAGA 

AUGUUGGA CUGAUGAG GCCGUUAGGC CGAA AUUUCUGA 

CAAUGUUG CUGAUGAG GCCGUUAGGC CGAA AUAUUUCU 

ACUCGUGC CUGAUGAG GCCGUUAGGC CGAA AUGUUGGA 

AACAAAGA CUGAUGAG GCCGUUAGGC CGAA ACUCGUGC 

UAAACAAA CUGAUGAG GCCGUUAGGC CGAA AUACUCGU 

AAUAAACA CUGAUGAG GCCGUUAGGC CGAA AGAUACUC 

GAAUAAAC CUGAUGAG GCCGUUAGGC CGAA AAGAUACU 

GAGGAAUA CUGAUGAG GCCGUUAGGC CGAA ACAAAGAU 

GGAGGAAU CUGAUGAG GCCGUUAGGC CGAA AACAAAGA 

UGGAGGAA CUGAUGAG GCCGUUAGGC CGAA AAACAAAG 

UGUGGAGG CUGAUGAG GCCGUUAGGC CGAA AUAAACAA 

CUGUGGAG CUGAUGAG GCCGUUAGGC CGAA AAUAAACA 

AGUCUGUG CUGAUGAG GCCGUUAGGC CGAA AGGAAUAA 

CUCUGGCG CUGAUGAG GCCGUUAGGC CGAA AGUCUGUG 

AUCAGGAC CUGAUGAG GCCGUUAGGC CGAA AGGUGUCU 

UUCAUCAG CUGAUGAG GCCGUUAGGC CGAA ACUAGGUG 

AAGGAGCA CUGAUGAG GCCGUUAGGC CGAA ACGUUUCA 

AGGACAAG CUGAUGAG GCCGUUAGGC CGAA AGCAGACG 

AUUAGGAC CUGAUGAG GCCGUUAGGC CGAA AGGAGCAG 

AAUAUUAG CUGAUGAG GCCGUUAGGC CGAA ACAAGGAG 



CUUGUCCU A AUAUUCAU
GUCCUAAU A UUCAUAUC
CCUAAUAU U CAUAUCAA
CUAAUAUU C AUAUCAAC
AUAUUCAU A UCAACAGC
AUUCAUAU C AACAGCAC
AGCACCAU U CCUGGCAU
GCACCAUU C CUGGCAUU
CCUGGCAU U CACAUUUU
CUGGCAUU C ACAUUUUA
AUUCACAU U UUAAAAAU
UUCACAUU U UAAAAAUU
UCACAUUU U AAAAAUUA
CACAUUUU A AAAAUUAU
UUAAAAAU U AUGUGGAA
UAAAAAUU A UGUGGAAG
AAGUGGAU A GGAGAACU
GCAGCUGU C AAUAGCCU
CUGUCAAU A GCCUAGGG
AAUAGCCU A GGGCUGAA



AUGAAUAU CUGAUGAG GCCGUUAGGC CGAA AGGACAAG 

GAUAUGAA CUGAUGAG GCCGUUAGGC CGAA AUUAGGAC 

UUGAUAUG CUGAUGAG GCCGUUAGGC CGAA AUAUUAGG 

GUUGAUAU CUGAUGAG GCCGUUAGGC CGAA AAUAUUAG 

GCUGUUGA CUGAUGAG GCCGUUAGGC CGAA AUGAAUAU 

GUGCUGUU CUGAUGAG GCCGUUAGGC CGAA AUAUGAAU 

AUGCCAGG CUGAUGAG GCCGUUAGGC CGAA AUGGUGCU 

AAUGCCAG CUGAUGAG GCCGUUAGGC CGAA AAUGGUGC 

AAAAUGUG CUGAUGAG GCCGUUAGGC CGAA AUGCCAGG 

UAAAAUGU CUGAUGAG GCCGUUAGGC CGAA AAUGCCAG 

AUUUUUAA CUGAUGAG GCCGUUAGGC CGAA AUGUGAAU 

AAUUUUUA CUGAUGAG GCCGUUAGGC CGAA AAUGUGAA 

UAAUUUUU CUGAUGAG GCCGUUAGGC CGAA AAAUGUGA 

AUAAUUUU CUGAUGAG GCCGUUAGGC CGAA AAAAUGUG 

UUCCACAU CUGAUGAG GCCGUUAGGC CGAA AUUUUUAA 

CUUCCACA CUGAUGAG GCCGUUAGGC CGAA AAUUUUUA 

AGUUCUCC CUGAUGAG GCCGUUAGGC CGAA AUCCACUU 

AGGCUAUU CUGAUGAG GCCGUUAGGC CGAA ACAGCUGC 

CCCUAGGC CUGAUGAG GCCGUUAGGC CGAA AUUGACAG 

UUCAGCCC CUGAUGAG GCCGUUAGGC CGAA AGGCUAUU 



GGCUGAAU U UUUGUCAG 



CUGACAAA CUGAUGAG GCCGUUAGGC CGAA AUUCAGCC 



GCUGAAUU U UUGUCAGA 



UCUGACAA CUGAUGAG GCCGUUAGGC CGAA AAUUCAGC 



CUGAAUUU U UGUCAGAU 



AUCUGACA CUGAUGAG GCCGUUAGGC CGAA AAAUUCAG 



UGAAUUUU U GUCAGAUA 



UAUCUGAC CUGAUGAG GCCGUUAGGC CGAA AAAAUUCA 



AUUUUUGU C AGAUAAAU 



AUUUAUCU CUGAUGAG GCCGUUAGGC CGAA ACAAAAAU 



UGUCAGAU A AAUAAAAU 



AUUUUAUU CUGAUGAG GCCGUUAGGC CGAA AUCUGACA 



AGAUAAAU A AAAUAAAU 



AUUUAUUU CUGAUGAG GCCGUUAGGC CGAA AUUUAUCU 



AAUAAAAU A AAUCAUUC 



GAAUGAUU CUGAUGAG GCCGUUAGGC CGAA AUUUUAUU 



AAAUAAAU C AUUCAUCC 



GGAUGAAU CUGAUGAG GCCGUUAGGC CGAA AUUUAUUU 



UAAAUCAU U CAUCCUUU 



AAAGGAUG CUGAUGAG GCCGUUAGGC CGAA AUGAUUUA 



AAAUCAUU C AUCCUUUU 



AAAAGGAU CUGAUGAG GCCGUUAGGC CGAA AAUGAUUU 



UCAUUCAU C CUUUUUUU 



AAAAAAAG CUGAUGAG GCCGUUAGGC CGAA AUGAAUGA 



UUCAUCCU U UUUUUGAU 



AUCAAAAA CUGAUGAG GCCGUUAGGC CGAA AGGAUGAA 



UCAUCCUU U UUUUGAUU 



AAUCAAAA CUGAUGAG GCCGUUAGGC CGAA AAGGAUGA 



CAUCCUUU U UUUGAUUA 



UAAUCAAA CUGAUGAG GCCGUUAGGC CGAA AAAGGAUG 



AUCCUUUU U UUGAUUAU 



AUAAUCAA CUGAUGAG GCCGUUAGGC CGAA AAAAGGAU 



UCCUUUUU U UGAUUAUA 



UAUAAUCA CUGAUGAG GCCGUUAGGC CGAA AAAAAGGA 



CCUUUUUU U GAUUAUAA 



UUAUAAUC CUGAUGAG GCCGUUAGGC CGAA AAAAAAGG 



UUUUUGAU U AUAAAAUU 



AAUUUUAU CUGAUGAG GCCGUUAGGC CGAA AUCAAAAA 



UUUUGAUU A UAAAAUUU 



AAAUUUUA CUGAUGAG GCCGUUAGGC CGAA AAUCAAAA 



UUGAUUAU A AAAUUUUC 



GAAAAUUU CUGAUGAG GCCGUUAGGC CGAA AUAAUCAA 



3156 
3157 



UAUAAAAU U UUCUAAAA 



UUUUAGAA CUGAUGAG GCCGUUAGGC CGAA AUUUUAUA 



AUAAAAUU U UCUAAAAU 



AUUUUAGA CUGAUGAG GCCGUUAGGC CGAA AAUUUUAU 



UAAAAUUU U CUAAAAUG 



CAUUUUAG CUGAUGAG GCCGUUAGGC CGAA AAAUUUUA 



AAAAUUUU C UAAAAUGU 



ACAUUUUA CUGAUGAG GCCGUUAGGC CGAA AAAAUUUU 



3172 
3262 
3173 
3263 
3178 



AAUUUUCU A AAAUGUAU
UAAAAUGU A UUUUAGAC
AAAUGUAU U UUAGACUU
AAAUGUAU U UUAGACUU
AAUGUAUU U UAGACUUC
AAUGUAUU U UAGACUUC
AUGUAUUU U AGACUUCC
AUGUAUUU U AGACUUCC
UGUAUUUU A GACUUCCU
UGUAUUUU A GACUUCCU
UUUAGACU U CCUGUAGG



AUACAUUU CUGAUGAG GCCGUUAGGC CGAA AGAAAAUU 
GUCUAAAA CUGAUGAG GCCGUUAGGC CGAA ACAUUUUA 
AAGUCUAA CUGAUGAG GCCGUUAGGC CGAA AUACAUUU 
AAGUCUAA CUGAUGAG GCCGUUAGGC CGAA AUACAUUU 
GAAGUCUA CUGAUGAG GCCGUUAGGC CGAA AAUACAUU 
GAAGUCUA CUGAUGAG GCCGUUAGGC CGAA AAUACAUU 
GGAAGUCU CUGAUGAG GCCGUUAGGC CGAA AAAUACAU 
GGAAGUCU CUGAUGAG GCCGUUAGGC CGAA AAAUACAU 
AGGAAGUC CUGAUGAG GCCGUUAGGC CGAA AAAAUACA 
AGGAAGUC CUGAUGAG GCCGUUAGGC CGAA AAAAUACA 
CCUACAGG CUGAUGAG GCCGUUAGGC CGAA AGUCUAAA 



UUUAGACU U CCUGUAGG 



CCUACAGG CUGAUGAG GCCGUUAGGC CGAA AGUCUAAA 



UUAGACUU C CUGUAGGG 



CCCUACAG CUGAUGAG GCCGUUAGGC CGAA AAGUCUAA 



UUAGACUU C CUGUAGGG 



CCCUACAG CUGAUGAG GCCGUUAGGC CGAA AAGUCUAA 



CUUCCUGU A GGGGGCGA 



UCGCCCCC CUGAUGAG GCCGUUAGGC CGAA ACAGGAAG 



CUUCCUGU A GGGGGCGA 



UCGCCCCC CUGAUGAG GCCGUUAGGC CGAA ACAGGAAG 



GGGGCGAU A UACUAAAU 



AUUUAGUA CUGAUGAG GCCGUUAGGC CGAA AUCGCCCC 



GGGGCGAU A UACUAAAU 



AUUUAGUA CUGAUGAG GCCGUUAGGC CGAA AUCGCCCC 



GGCGAUAU A CUAAAUGU 



ACAUUUAG CUGAUGAG GCCGUUAGGC CGAA AUAUCGCC 



GGCGAUAU A CUAAAUGU 



ACAUUUAG CUGAUGAG GCCGUUAGGC CGAA AUAUCGCC 



GAUAUACU A AAUGUAUA 



UAUACAUU CUGAUGAG GCCGUUAGGC CGAA AGUAUAUC 



CUAAAUGU A UAUAGUAC 



GUACUAUA CUGAUGAG GCCGUUAGGC CGAA ACAUUUAG 



AAAUGUAU A UAGUACAU 



AUGUACUA CUGAUGAG GCCGUUAGGC CGAA AUACAUUU 



AUGUAUAU A GUACAUUU 



AAAUGUAC CUGAUGAG GCCGUUAGGC CGAA AUAUACAU 



UAUAUAGU A CAUUUAUA 



UAUAAAUG CUGAUGAG GCCGUUAGGC CGAA ACUAUAUA 



UAGUACAU U UAUACUAA 



UUAGUAUA CUGAUGAG GCCGUUAGGC CGAA AUGUACUA 



AGUACAUU U AUACUAAA 



UUUAGUAU CUGAUGAG GCCGUUAGGC CGAA AAUGUACU 



GUACAUUU A UACUAAAU 



AUUUAGUA CUGAUGAG GCCGUUAGGC CGAA AAAUGUAC 



ACAUUUAU A CUAAAUGU 



ACAUUUAG CUGAUGAG GCCGUUAGGC CGAA AUAAAUGU 



UUUAUACU A AAUGUAUU 



AAUACAUU CUGAUGAG GCCGUUAGGC CGAA AGUAUAAA 



CUAAAUGU A UUCCUGUA 



UACAGGAA CUGAUGAG GCCGUUAGGC CGAA ACAUUUAG 



AAAUGUAU U CCUGUAGG 



CCUACAGG CUGAUGAG GCCGUUAGGC CGAA AUACAUUU 



AAUGUAUU C CUGUAGGG 



CCCUACAG CUGAUGAG GCCGUUAGGC CGAA AAUACAUU 



AUUCCUGU A GGGGGCGA 



UCGCCCCC CUGAUGAG GCCGUUAGGC CGAA ACAGGAAU 



GAUAUACU A AAUGUAUU 



AAUACAUU CUGAUGAG GCCGUUAGGC CGAA AGUAUAUC 



CUAAAUGU A UUUUAGAC 



GUCUAAAA CUGAUGAG GCCGUUAGGC CGAA ACAUUUAG 



GGGGCGAU A AAAUAAAA 



UUUUAUUU CUGAUGAG GCCGUUAGGC CGAA AUCGCCCC 



3289 
3297 



GAUAAAAU A AAAUGCUA 



UAGCAUUU CUGAUGAG GCCGUUAGGC CGAA AUUUUAUC 



AAAAUGCU A AACAACUG 



CAGUUGUU CUGAUGAG GCCGUUAGGC CGAA AGCAUUUU 



Input Sequence = NM_001285. Cut Site = UH/.
Arm Length = 8. Core Sequence = CUGAUGAG GCCGUUAGGC CGAA.
The underlined region can be any X sequence or linker, as described herein.
NM_001285 (Homo sapiens chloride channel, calcium activated, 1 (CLCA1) mRNA, 3311 bp)
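The relationship between each substrate row and its ribozyme row in the table above can be sketched in a few lines of Python. This is only an illustration inferred from the listed sequences and the footer parameters (UH cut site, 8-nucleotide arms, the quoted core); the function names `revcomp`, `hammerhead`, and `uh_sites` are hypothetical, and the indexing used by `uh_sites` is an assumption that need not match the position numbering of the tables.

```python
# Hammerhead core as quoted in the table footer.
CORE = "CUGAUGAG GCCGUUAGGC CGAA"
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def revcomp(rna: str) -> str:
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(PAIR[base] for base in reversed(rna))


def hammerhead(substrate: str) -> str:
    """Build a ribozyme row from a substrate written as "5'-arm H 3'-arm"."""
    five_arm, h, three_arm = substrate.split()
    if not five_arm.endswith("U") or h not in ("A", "C", "U"):
        raise ValueError("not a UH site: " + substrate)
    # The H nucleotide stays unpaired; each binding arm is the reverse
    # complement of the substrate stretch on the opposite side of the cut.
    return f"{revcomp(three_arm)} {CORE} {revcomp(five_arm)}"


def uh_sites(mrna: str, arm: int = 8):
    """Yield (index_of_U, substrate) for every UH site with full-length arms.

    The index is 0-based and points at the U. This numbering is an
    assumption of the sketch, not the convention of the tables.
    """
    for i in range(arm - 1, len(mrna) - arm - 1):
        if mrna[i] == "U" and mrna[i + 1] in "ACU":
            five_arm = mrna[i - arm + 1:i + 1]
            three_arm = mrna[i + 2:i + 2 + arm]
            yield i, f"{five_arm} {mrna[i + 1]} {three_arm}"


print(hammerhead("AGCACUGU A CAUACCUG"))
```

For the substrate AGCACUGU A CAUACCUG this prints CAGGUAUG CUGAUGAG GCCGUUAGGC CGAA ACAGUGCU, the ribozyme listed for that site. Running `uh_sites` over the full NM_001285 sequence would enumerate candidate sites in the same substrate format, although any further filtering applied when the tables were compiled is not modeled here.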




Pos   Substrate           Ribozyme                                    Rz SeqID No
1840  UAACCCUC C AGAACAGC  GCUGUUCU CUGAUGAG GCCGUUAGGC CGAA IAGGGUUA  3101
1841  AACCCUCC A GAACAGCC  GGCUGUUC CUGAUGAG GCCGUUAGGC CGAA IGAGGGUU  3102
1846  UCCAGAAC A GCCAGUGG  CCACUGGC CUGAUGAG GCCGUUAGGC CGAA IUUCUGGA  3103
1849  AGAACAGC C AGUGGAUG  CAUCCACU CUGAUGAG GCCGUUAGGC CGAA ICUGUUCU  3104
1850  GAACAGCC A GUGGAUGA  UCAUCCAC CUGAUGAG GCCGUUAGGC CGAA IGCUGUUC  3105
1864  UGAAUGGC A CAGUGAUC  GAUCACUG CUGAUGAG GCCGUUAGGC CGAA ICCAUUCA  3106
1866  AAUGGCAC A GUGAUCGU  ACGAUCAC CUGAUGAG GCCGUUAGGC CGAA IUGCCAUU  3107
1879  UCGUGGAC A GCACCGUG  CACGGUGC CUGAUGAG GCCGUUAGGC CGAA IUCCACGA  3108
1882  UGGACAGC A CCGUGGGA  UCCCACGG CUGAUGAG GCCGUUAGGC CGAA ICUGUCCA  3109
1884  GACAGCAC C GUGGGAAA  UUUCCCAC CUGAUGAG GCCGUUAGGC CGAA IUGCUGUC  3110
1897  GAAAGGAC A CUUUGUUU  AAACAAAG CUGAUGAG GCCGUUAGGC CGAA IUCCUUUC  3111
1899  AAGGACAC U UUGUUUCU  AGAAACAA CUGAUGAG GCCGUUAGGC CGAA IUGUCCUU  3112
1907  UUUGUUUC U UAUCACCU  AGGUGAUA CUGAUGAG GCCGUUAGGC CGAA IAAACAAA  3113
1912  UUCUUAUC A CCUGGACA  UGUCCAGG CUGAUGAG GCCGUUAGGC CGAA IAUAAGAA  3114
1914  CUUAUCAC C UGGACAAC  GUUGUCCA CUGAUGAG GCCGUUAGGC CGAA IUGAUAAG  3115
1915  UUAUCACC U GGACAACG  CGUUGUCC CUGAUGAG GCCGUUAGGC CGAA IGUGAUAA  3116
1920  ACCUGGAC A ACGCAGCC  GGCUGCGU CUGAUGAG GCCGUUAGGC CGAA IUCCAGGU  3117
1925  GACAACGC A GCCUCCCC  GGGGAGGC CUGAUGAG GCCGUUAGGC CGAA ICGUUGUC  3118
1928  AACGCAGC C UCCCCAAA  UUUGGGGA CUGAUGAG GCCGUUAGGC CGAA ICUGCGUU  3119
1929  ACGCAGCC U CCCCAAAU  AUUUGGGG CUGAUGAG GCCGUUAGGC CGAA IGCUGCGU  3120
1931  GCAGCCUC C CCAAAUCC  GGAUUUGG CUGAUGAG GCCGUUAGGC CGAA IAGGCUGC  3121
1932  CAGCCUCC C CAAAUCCU  AGGAUUUG CUGAUGAG GCCGUUAGGC CGAA IGAGGCUG  3122
1933  AGCCUCCC C AAAUCCUU  AAGGAUUU CUGAUGAG GCCGUUAGGC CGAA IGGAGGCU  3123
1934  GCCUCCCC A AAUCCUUC  GAAGGAUU CUGAUGAG GCCGUUAGGC CGAA IGGGAGGC  3124
1939  CCCAAAUC C UUCUCUGG  CCAGAGAA CUGAUGAG GCCGUUAGGC CGAA IAUUUGGG  3125
1940  CCAAAUCC U UCUCUGGG  CCCAGAGA CUGAUGAG GCCGUUAGGC CGAA IGAUUUGG  3126
1943  AAUCCUUC U CUGGGAUC  GAUCCCAG CUGAUGAG GCCGUUAGGC CGAA IAAGGAUU  3127
1945  UCCUUCUC U GGGAUCCC  GGGAUCCC CUGAUGAG GCCGUUAGGC CGAA IAGAAGGA  3128
1952  CUGGGAUC C CAGUGGAC  GUCCACUG CUGAUGAG GCCGUUAGGC CGAA IAUCCCAG  3129
1953  UGGGAUCC C AGUGGACA  UGUCCACU CUGAUGAG GCCGUUAGGC CGAA IGAUCCCA  3130
1954  GGGAUCCC A GUGGACAG  CUGUCCAC CUGAUGAG GCCGUUAGGC CGAA IGGAUCCC  3131
1961  CAGUGGAC A GAAGCAAG  CUUGCUUC CUGAUGAG GCCGUUAGGC CGAA IUCCACUG  3132
1967  ACAGAAGC A AGGUGGCU  AGCCACCU CUGAUGAG GCCGUUAGGC CGAA ICUUCUGU  3133
1975  AAGGUGGC U UUGUAGUG  CACUACAA CUGAUGAG GCCGUUAGGC CGAA ICCACCUU  3134



Pos   Substrate           SeqID No  Ribozyme                                    Rz SeqID No
2211  GCCUCCCC A AUUCUCAG  1014      CUGAGAAU CUGAUGAG GCCGUUAGGC CGAA IGGGAGGC  3203
2216  CCCAAUUC U CAGGGCCA  1015      UGGCCCUG CUGAUGAG GCCGUUAGGC CGAA IAAUUGGG  3204
2218  CAAUUCUC A GGGCCAGU  1016      ACUGGCCC CUGAUGAG GCCGUUAGGC CGAA IAGAAUUG  3205
2223  CUCAGGGC C AGUGUCAC  1017      GUGACACU CUGAUGAG GCCGUUAGGC CGAA ICCCUGAG  3206
2224  UCAGGGCC A GUGUCACA  1018      UGUGACAC CUGAUGAG GCCGUUAGGC CGAA IGCCCUGA  3207
2230  CCAGUGUC A CAGCCCUG  1019      CAGGGCUG CUGAUGAG GCCGUUAGGC CGAA IACACUGG  3208
2232  AGUGUCAC A GCCCUGAU  1020      AUCAGGGC CUGAUGAG GCCGUUAGGC CGAA IUGACACU  3209
2235  GUCACAGC C CUGAUUGA  1021      UCAAUCAG CUGAUGAG GCCGUUAGGC CGAA ICUGUGAC  3210
2236  UCACAGCC C UGAUUGAA  1022      UUCAAUCA CUGAUGAG GCCGUUAGGC CGAA IGCUGUGA  3211
2237  CACAGCCC U GAUUGAAU  1023      AUUCAAUC CUGAUGAG GCCGUUAGGC CGAA IGGCUGUG  3212
2247  AUUGAAUC A GUGAAUGG  1024      CCAUUCAC CUGAUGAG GCCGUUAGGC CGAA IAUUCAAU  3213
2262  GGAAAAAC A GUUACCUU  1025      AAGGUAAC CUGAUGAG GCCGUUAGGC CGAA IUUUUUCC  3214
2268  ACAGUUAC C UUGGAACU  1026      AGUUCCAA CUGAUGAG GCCGUUAGGC CGAA IUAACUGU  3215
2269  CAGUUACC U UGGAACUA  1027      UAGUUCCA CUGAUGAG GCCGUUAGGC CGAA IGUAACUG  3216
2276  CUUGGAAC U ACUGGAUA  1028      UAUCCAGU CUGAUGAG GCCGUUAGGC CGAA IUUCCAAG  3217
2279  GGAACUAC U GGAUAAUG  1029      CAUUAUCC CUGAUGAG GCCGUUAGGC CGAA IUAGUUCC  3218
2292  AAUGGAGC A GGUGCUGA  1030      UCAGCACC CUGAUGAG GCCGUUAGGC CGAA ICUCCAUU  3219
2298  GCAGGUGC U GAUGCUAC  1031      GUAGCAUC CUGAUGAG GCCGUUAGGC CGAA ICACCUGC  3220
2304  GCUGAUGC U ACUAAGGA  1032      UCCUUAGU CUGAUGAG GCCGUUAGGC CGAA ICAUCAGC  3221
2307  GAUGCUAC U AAGGAUGA  1033      UCAUCCUU CUGAUGAG GCCGUUAGGC CGAA IUAGCAUC  3222
2323  ACGGUGUC U ACUCAAGG  1034      CCUUGAGU CUGAUGAG GCCGUUAGGC CGAA IACACCGU  3223
2326  GUGUCUAC U CAAGGUAU  1035      AUACCUUG CUGAUGAG GCCGUUAGGC CGAA IUAGACAC  3224
2328  GUCUACUC A AGGUAUUU  1036      AAAUACCU CUGAUGAG GCCGUUAGGC CGAA IAGUAGAC  3225
2338  GGUAUUUC A CAACUUAU  1037      AUAAGUUG CUGAUGAG GCCGUUAGGC CGAA IAAAUACC  3226
2340  UAUUUCAC A ACUUAUGA  1038      UCAUAAGU CUGAUGAG GCCGUUAGGC CGAA IUGAAAUA  3227
2343  UUCACAAC U UAUGACAC  1039      GUGUCAUA CUGAUGAG GCCGUUAGGC CGAA IUUGUGAA  3228
2350  CUUAUGAC A CGAAUGGU  1040      ACCAUUCG CUGAUGAG GCCGUUAGGC CGAA IUCAUAAG  3229
2365  GUAGAUAC A GUGUAAAA  1041      UUUUACAC CUGAUGAG GCCGUUAGGC CGAA IUAUCUAC  3230
2382  GUGCGGGC U CUGGGAGG  1042      CCUCCCAG CUGAUGAG GCCGUUAGGC CGAA ICCCGCAC  3231
2384  GCGGGCUC U GGGAGGAG  1043      CUCCUCCC CUGAUGAG GCCGUUAGGC CGAA IAGCCCGC  3232
2400  GUUAACGC A GCCAGACG  1044      CGUCUGGC CUGAUGAG GCCGUUAGGC CGAA ICGUUAAC  3233
2403  AACGCAGC C AGACGGAG  1045      CUCCGUCU CUGAUGAG GCCGUUAGGC CGAA ICUGCGUU  3234
2404  ACGCAGCC A GACGGAGA  1046      UCUCCGUC CUGAUGAG GCCGUUAGGC CGAA IGCUGCGU  3235
2420  AGUGAUAC C CCAGCAGA  1047      UCUGCUGG CUGAUGAG GCCGUUAGGC CGAA IUAUCACU  3236



Substrate           SeqID No  Ribozyme                                   Rz SeqID No
AUAUAAUU G AAUAUUUU  1211      AAAAUAUU UGAUG GCAUGCACUAUGC GCG AAUUAUAU  3400
GGGAGCAU G AAGAGGUG  1212      CACCUCUU UGAUG GCAUGCACUAUGC GCG AUGCUCCC  3401
GAGGUGUU G AGGUUAUG  1213      CAUAACCU UGAUG GCAUGCACUAUGC GCG AACACCUC  3402
GCACAGCU G AAGGCAGA  1214      UCUGCCUU UGAUG GCAUGCACUAUGC GCG AGCUGUGC  3403
ACAAGUAC G CAAUUUGA  1215      UCAAAUUG UGAUG GCAUGCACUAUGC GCG GUACUUGU  3404
CGCAAUUU G AGACUAAG  1216      CUUAGUCU UGAUG GCAUGCACUAUGC GCG AAAUUGCG  3405
CUCCUAUU G AAGACAAG  1217      CUUGUCUU UGAUG GCAUGCACUAUGC GCG AAUAGGAG  3406
AGACCUGU G AUAAACCA  1218      UGGUUUAU UGAUG GCAUGCACUAUGC GCG ACAGGUCU  3407
CCACUUCC G AUAAGUUG  1219      CAACUUAU UGAUG GCAUGCACUAUGC GCG GGAAGUGG  3408
CGUAACCC G CAUUUUCC  1220      GGAAAAUG UGAUG GCAUGCACUAUGC GCG GGGUUACG  3409
UUCAUCUU G AUUCUUCA  1221      UGAAGAAU UGAUG GCAUGCACUAUGC GCG AAGAUGAA  3410
GGGGCCCU G AGUAAUUC  1222      GAAUUACU UGAUG GCAUGCACUAUGC GCG AGGGCCCC  3411
AUUCAGCU G AACAACAA  1223      UUGUUGUU UGAUG GCAUGCACUAUGC GCG AGCUGAAU  3412
AUGGCUAU G AAGGCAUU  1224      AAUGCCUU UGAUG GCAUGCACUAUGC GCG AUAGCCAU  3413
UUGUCGUU G CAAUCGAC  1225      GUCGAUUG UGAUG GCAUGCACUAUGC GCG AACGACAA  3414
UUGCAAUC G ACCCCAAU  1226      AUUGGGGU UGAUG GCAUGCACUAUGC GCG GAUUGCAA  3415
CCCAAUGU G CCAGAAGA  1227      UCUUCUGG UGAUG GCAUGCACUAUGC GCG ACAUUGGG  3416
CAGAAGAU G AAACACUC  1228      GAGUGUUU UGAUG GCAUGCACUAUGC GCG AUCUUCUG  3417
GACAUGGU G ACCCAGGC  1229      GCCUGGGU UGAUG GCAUGCACUAUGC GCG ACCAUGUC  3418
AUCUGUUU G AAGCUACA  1230      UGUAGCUU UGAUG GCAUGCACUAUGC GCG AAACAGAU  3419
AGGAAAGC G AUUUUAUU  1231      AAUAAAAU UGAUG GCAUGCACUAUGC GCG GCUUUCCU  3420
AAAAUGUU G CCAUUUUG  1232      CAAAAUGG UGAUG GCAUGCACUAUGC GCG AACAUUUU  3421
GCCAUUUU G AUUCCUGA  1233      UCAGGAAU UGAUG GCAUGCACUAUGC GCG AAAAUGGC  3422
UGAUUCCU G AAACAUGG  1234      CCAUGUUU UGAUG GCAUGCACUAUGC GCG AGGAAUCA  3423
CAAAGGCU G ACUAUGUG  1235      CACAUAGU UGAUG GCAUGCACUAUGC GCG AGCCUUUG  3424
GACUAUGU G AGACCAAA  1236      UUUGGUCU UGAUG GCAUGCACUAUGC GCG ACAUAGUC  3425
CAAAACUU G AGACCUAC  1237      GUAGGUCU UGAUG GCAUGCACUAUGC GCG AAGUUUUG  3426
ACAAAAAU G CUGAUGUU  1238      AACAUCAG UGAUG GCAUGCACUAUGC GCG AUUUUUGU  3427


Pos 












H 












i- 
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H 




cr 




o 





HI h 



3559 
3560 
3561 


3562 
3563 
3564 


n 




3570 
3571 
3572 


§§§ 


3576 
3577 
3578 


3579 
3580 
3581 


3582 
3583 
3584 


ill 


Ills 


AUGCUUGA UGAUG GCAUGCACUAUGC GCG AUAACCUC 
AAUGAUAA UGAUG GCAUGCACUAUGC GCG AAUAUCUU 
GUUUAUCA UGAUG GCAUGCACUAUGC GCG AGGUCUUU 
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1509 1 


1510 1 


1511 


1512 


1513 


1514 


1515 


1516 


1517 


1518 


1519 


1520 


1521 


1522 


1523 


1524 


1525 


1526 


1527 


1528 


1529 


1530 


1531 


1532 


1533 


1534 


1535 


1536 


1537 


1538 


1539 


1540 


1541 


< 

1 

o 

B 


CAGUGGCA G UGACAGGG 


UACCUGCA G CAGCUUCA 


CUGCAGCA G CUUCAGGA 


GGAGGGAC G UCCAUCUG 1 


CAUCUGCA G CGGGCUUC 


UGCAGCGG G CUUCGAUC 


UUCGAUCG G CAUUUACU 1 


CACUAUAA G UGGGUGCU 1 


AUAAGUGG G UGCUUUAA 1 


UUAACGAG G UCAAACAA 


CAAACAAA G UGGUGCCA 


ACAAAGUG G UGCCAUCA 


UCCACACA G UCGCUUUG 


GCUUUGGG G CCCUCUGC 


CCUCUGCA G CUCAAGAA 


CUAGAGGA G CUGUCCAA 


GACAGGAG G UUUACAGA 


CAGAUCAA G UUCAGAAC 


GAACAAUG G CCUCAUUG 


CUUUUGGG G CCCUUUCA 


GAAAUGGA G CUGUCUCU 


GUCUCUCA G CGCUCCAU 


UCCAUCCA G CUUGAGAG 


GCUUGAGA G UAAGGGAU 


CCAGAACA G CCAGUGGA 


AACAGCCA G UGGAUGAA 


GAUGAAUG G CACAGUGA 


AUGGCACA G UGAUCGUG 


CAGUGAUC G UGGACAGC 


CGUGGACA G CACCGUGG 


ACAGCACC G UGGGAAAG 


ACAACGCA G CCUCCCCA 


1445 


1448 


1483 


1486 


1500 


1511 


1515 


1525 


1607 


1611 


1624 


1634 


1637 


1654 


1665 


1675 


1692 


1712 


1738 


1751 


1771 


1792 


1803 


1815 


1823 


1847 


1851 


1862 


1867 


1873 


1880 


1885 


1926 1 







n 


n 


n 


n 


Hi 


m 




a 


ii 


1 






3873 
3874 




UCUGUCCA GCCGAAAGGCGAGUGAGGUCU UGGGAUCC 
CCACCUUG GCCGAAAGGCGAGUGAGGUCU UUCUGUCC 


3 

I 

1 


1 
1 


UAGCAAUG GCCGAAAGGCGAGUGAGGUCU CUGGGAUU 


UCCAAGUG GCCGAAAGGCGAGUGAGGUCU CAACCUUA 


I 

li 


jj| 

1! 


| 

1! 


1 
11 


it 

5 8 E 

1 p 

1! 


GACACUGG GCCGAAAGGCGAGUGAGGUCU CCUGAGAA 


AAUCAGGG GCCGAAAGGCGAGUGAGGUCU UGUGACAC 


UCCAUUCA GCCGAAAGGCGAGUGAGGUCU UGAUUCAA 


AGCACCUG GCCGAAAGGCGAGUGAGGUCU UCCAUUAU 


AGUAGACA GCCGAAAGGCGAGUGAGGUCU CGUCAUCC 
GUGAAAUA GCCGAAAGGCGAGUGAGGUCU CUUGAGUA 


UGUAUCUA GCCGAAAGGCGAGUGAGGUCU CAUUCGUG 
CUUUUACA GCCGAAAGGCGAGUGAGGUCU UGUAUCUA 
AGCCCGCA GCCGAAAGGCGAGUGAGGUCU UUUUACAC 








m 




m 










111 








1570 
1571 


1572 
1573 
1574 


GGAUCCCA G UGGACAGA 
GGACAGAA G CAAGGUGG 


u 

3 CD 
3 CD 


cd c 


AAUCCCAG G CAUUGCUA 


ill 

I 


GUCUGCAA G CAAGCUCA 
GCAAGCAA G CUCACAAA 


ACUGUCAC G UCCCGUGC 
CACGUCCC G UGCGUCCA 


CAAUUACA G UGACUUCC 


AUUCCCCA G CCCUCUGG 


Ill 

iii 


II 

iii 


ii 

ii 


II 
F 


II 

3 CD 

II 


ill 

3 CD CD 

ill 


CACGAAUG G UAGAUACA 
UAGAUACA G UGUAAAAG 
GUGUAAAA G UGCGGGCU 


1955 
1965 


5§ 


i\ 




n\ 


Hi 


2091 
2096 




II 








I 


ii 


2318 
2331 


2357 
2366 



Pos   Substrate           SeqID No  Ribozyme                                 Rz SeqID No
2380  AAGUGCGG G CUCUGGGA  1575      UCCCAGAG GCCGAAAGGCGAGUGAGGUCU CCGCACUU  3878
2392  UGGGAGGA G UUAACGCA  1576      UGCGUUAA GCCGAAAGGCGAGUGAGGUCU UCCUCCCA  3879
2401  UUAACGCA G CCAGACGG  1577      CCGUCUGG GCCGAAAGGCGAGUGAGGUCU UGCGUUAA  3880
2413  GACGGAGA G UGAUACCC  1578      GGGUAUCA GCCGAAAGGCGAGUGAGGUCU UCUCCGUC  3881
2424  AUACCCCA G CAGAGUGG  1579      CCACUCUG GCCGAAAGGCGAGUGAGGUCU UGGGGUAU  3882
2429  CCAGCAGA G UGGAGCAC  1580      GUGCUCCA GCCGAAAGGCGAGUGAGGUCU UCUGCUGG  3883
2434  AGAGUGGA G CACUGUAC  1581      GUACAGUG GCCGAAAGGCGAGUGAGGUCU UCCACUCU  3884
2450  CAUACCUG G CUGGAUUG  1582      CAAUCCAG GCCGAAAGGCGAGUGAGGUCU CAGGUAUG  3885
2523  CAACACAA G CAAGUGUG  1583      CACACUUG GCCGAAAGGCGAGUGAGGUCU UUGUGUUG  3886
2527  ACAAGCAA G UGUGUUUC  1584      GAAACACA GCCGAAAGGCGAGUGAGGUCU UUGCUUGU  3887
2537  GUGUUUCA G CAGAACAU  1585      AUGUUCUG GCCGAAAGGCGAGUGAGGUCU UGAAACAC  3888
2555  CUCGGGAG G CUCAUUUG  1586      CAAAUGAG GCCGAAAGGCGAGUGAGGUCU CUCCCGAG  3889
2566  CAUUUGUG G CUUCUGAU  1587      AUCAGAAG GCCGAAAGGCGAGUGAGGUCU CACAAAUG  3890
2612  CCCACCUG G CCAAAUCA  1588      UGAUUUGG GCCGAAAGGCGAGUGAGGUCU CAGGUGGG  3891
2632  ACCUGAAG G CGGAAAUU  1589      AAUUUCCG GCCGAAAGGCGAGUGAGGUCU CUUCAGGU  3892
2648  UCACGGGG G CAGUCUCA  1590      UGAGACUG GCCGAAAGGCGAGUGAGGUCU CCCCGUGA  3893
2651  CGGGGGCA G UCUCAUUA  1591      UAAUGAGA GCCGAAAGGCGAGUGAGGUCU UGCCCCCG  3894
2674  CUUGGACA G CUCCUGGG  1592      CCCAGGAG GCCGAAAGGCGAGUGAGGUCU UGUCCAAG  3895
2704  AUGGAACA G CUCACAAG  1593      CUUGUGAG GCCGAAAGGCGAGUGAGGUCU UGUUCCAU  3896
2712  GCUCACAA G UAUAUCAU  1594      AUGAUAUA GCCGAAAGGCGAGUGAGGUCU UUGUGAGC  3897
2729  UCGAAUAA G UACAAGUA  1595      UACUUGUA GCCGAAAGGCGAGUGAGGUCU UUAUUCGA  3898
2735  AAGUACAA G UAUUCUUG  1596      CAAGAAUA GCCGAAAGGCGAGUGAGGUCU UUGUACUU  3899
2757  AGAGACAA G UUCAAUGA  1597      UCAUUGAA GCCGAAAGGCGAGUGAGGUCU UUGUCUCU  3900
2776  CUCUUCAA G UGAAUACU  1598      AGUAUUCA GCCGAAAGGCGAGUGAGGUCU UUGAAGAG  3901
2806  CAAAGGAA G CCAACUCU  1599      AGAGUUGG GCCGAAAGGCGAGUGAGGUCU UUCCUUUG  3902
2821  CUGAGGAA G UCUUUUUG  1600      CAAAAAGA GCCGAAAGGCGAGUGAGGUCU UUCCUCAG  3903
2861  UGAAAAUG G CACAGAUC  1601      GAUCUGUG GCCGAAAGGCGAGUGAGGUCU CAUUUUCA  3904
2887  CUAUUCAG G CUGUUGAU  1602      AUCAACAG GCCGAAAGGCGAGUGAGGUCU CUGAAUAG  3905
2899  UUGAUAAG G UCGAUCUG  1603      CAGAUCGA GCCGAAAGGCGAGUGAGGUCU CUUAUCAA  3906
2935  UUGCACGA G UAUCUUUG  1604      CAAAGAUA GCCGAAAGGCGAGUGAGGUCU UCGUGCAA  3907
2978  GACACCUA G UCCUGAUG  1605      CAUCAGGA GCCGAAAGGCGAGUGAGGUCU UAGGUGUC  3908
2991  GAUGAAAC G UCUGCUCC  1606      GGAGCAGA GCCGAAAGGCGAGUGAGGUCU GUUUCAUC  3909
3023  UAUCAACA G CACCAUUC  1607      GAAUGGUG GCCGAAAGGCGAGUGAGGUCU UGUUGAUA  3910



8 I 



o 3 " 
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Table VII: Human CLCA1 DNAzyme and Target Sequence 249.021 



Pos   Substrate           SeqID No  DNAzyme                            Rz SeqID No
17    CUUUUGGU A CAAAUGGA  4         TCCATTTG GGCTAGCTACAACGA ACCAAAAG  3919
34    UGUGGAAU A UAAUUGAA  5         TTCAATTA GGCTAGCTACAACGA ATTCCACA  3920
44    AAUUGAAU A UUUUCUUG  8         CAAGAAAA GGCTAGCTACAACGA ATTCAATT  3921
84    UUGAGGUU A UGUCAAGC  19        GCTTGACA GGCTAGCTACAACGA AACCTCAA  3922
122   AUGGAAAU A UUUACAAG  22        CTTGTAAA GGCTAGCTACAACGA ATTTCCAT  3923
126   AAAUAUUU A CAAGUACG  25        CGTACTTG GGCTAGCTACAACGA AAATATTT  3924
132   UUACAAGU A CGCAAUUU  26        AAATTGCG GGCTAGCTACAACGA ACTTGTAA  3925
152   ACUAAGAU A UUGUUAUC  30        GATAACAA GGCTAGCTACAACGA ATCTTAGT  3926
158   AUAUUGUU A UCAUUCUC  33        GAGAATGA GGCTAGCTACAACGA AACAATAT  3927
169   AUUCUCCU A UUGAAGAC  38        GTCTTCAA GGCTAGCTACAACGA AGGAGAAT  3928
259   GUGUGUCU A UAUUUUCA  52        TGAAAATA GGCTAGCTACAACGA AGACACAC  3929
261   GUGUCUAU A UUUUCAUA  53        TATGAAAA GGCTAGCTACAACGA ATAGACAC  3930
269   AUUUUCAU A UCUGUAUA  58        TATACAGA GGCTAGCTACAACGA ATGAAAAT  3931
275   AUAUCUGU A UAUAUAUA  60        TATATATA GGCTAGCTACAACGA ACAGATAT  3932
277   AUCUGUAU A UAUAUAAU  61        ATTATATA GGCTAGCTACAACGA ATACAGAT  3933
279   CUGUAUAU A UAUAAUGG  62        CCATTATA GGCTAGCTACAACGA ATATACAG  3934
281   GUAUAUAU A UAAUGGUA  63        TACCATTA GGCTAGCTACAACGA ATATATAC  3935
346   GGAGAUGU A CAGCAAUG  74        CATTGCTG GGCTAGCTACAACGA ACATCTCC  3936
446   CAAUGGCU A UGAAGGCA  97        TGCCTTCA GGCTAGCTACAACGA AGCCATTG  3937
539   AUCUCUGU A UCUGUUUG  108       CAAACAGA GGCTAGCTACAACGA ACAGAGAT  3938
553   UUGAAGCU A CAGGAAAG  112       CTTTCCTG GGCTAGCTACAACGA AGCTTCAA  3939
569   GCGAUUUU A UUUCAAAA  116       TTTTGAAA GGCTAGCTACAACGA AAAATCGC  3940
623   GGCUGACU A UGUGAGAC  126       GTCTCACA GGCTAGCTACAACGA AGTCAGCC  3941
647   UGAGACCU A CAAAAAUG  128       CATTTTTG GGCTAGCTACAACGA AGGTCTCA  3942
679   CUGAGUCU A CUCCUCCA  133       TGGAGGAG GGCTAGCTACAACGA AGACTCAG  3943
704   UGAACCCU A CACUGAGC  137       GCTCAGTG GGCTAGCTACAACGA AGGGTTCA  3944
791   AGCUGAAU A UGGACCAC  147       GTGGTCCA GGCTAGCTACAACGA ATTCAGCT  3945
834   GCUCAUCU A CGAUGGGG  154       CCCCATCG GGCTAGCTACAACGA AGATGAGC  3946
846   UGGGGAGU A UUUGACGA  155       TCGTCAAA GGCTAGCTACAACGA ACTCCCCA  3947
857   UGACGAGU A CAAUAAUG  158       CATTATTG GGCTAGCTACAACGA ACTCGTCA  3948
878   GAAAUUCU A CUUAUCCA  162       TGGATAAG GGCTAGCTACAACGA AGAATTTC  3949
882   UUCUACUU A UCCAAUGG  164       CCATTGGA GGCTAGCTACAACGA AAGTAGAA  3950
897   GGAAGAAU A CAAGCAGU  166       ACTGCTTG GGCTAGCTACAACGA ATTCTTCC  3951
922   CAGCAGGU A UUACUGGU  170       ACCAGTAA GGCTAGCTACAACGA ACCTGCTG  3952
925   CAGGUAUU A CUGGUACA  172       TGTACCAG GGCTAGCTACAACGA AATACCTG  3953
931   UUACUGGU A CAAAUGUA  173       TACATTTG GGCTAGCTACAACGA ACCAGTAA  3954
968   CAGCUGUU A CACCAAAA  178       TTTTGGTG GGCTAGCTACAACGA AACAGCTG  3955
997   AUAAAGUU A CAGGACUC  183       GAGTCCTG GGCTAGCTACAACGA AACTTTAT  3956
1007  AGGACUCU A UGAAAAAG  185       CTTTTTCA GGCTAGCTACAACGA AGAGTCCT  3957
1060  AGGCUUCU A UAAUGUUU  194       AAACATTA GGCTAGCTACAACGA AGAAGCCT  3958
1087  UUGAUUCU A UAGUUGAA  201       TTCAACTA GGCTAGCTACAACGA AGAATCAA  3959
1102  AAUUCUGU A CAGAACAA  206       TTGTTCTG GGCTAGCTACAACGA ACAGAATT  3960
1213  CCACUCCU A UGACAACA  218       TGTTGTCA GGCTAGCTACAACGA AGGAGTGG  3961
1416  GCCCAUGU A CAAAGUGA  245       TCACTTTG GGCTAGCTACAACGA ACATGGGC  3962
1431  GAACUCAU A CAGAUAAA  247       TTTATCTG GGCTAGCTACAACGA ATGAGTTC  3963
1476  AAAAGAUU A CCUGCAGC  251       GCTGCAGG GGCTAGCTACAACGA AATCTTTT  3964
1531  CGGCAUUU A CUGUGAUU  261       AATCACAG GGCTAGCTACAACGA AAATGCCG  3965
1550  GAAGAAAU A UCCAACUG  264       CAGTTGGA GGCTAGCTACAACGA ATTTCTTC  3966
1603  ACAACACU A UAAGUGGG  268       CCCACTTA GGCTAGCTACAACGA AGTGTTGT  3967
1716  GGAGGUUU A CAGACAUA  285       TATGTCTG GGCTAGCTACAACGA AAACCTCC  3968
1724  ACAGACAU A UGCUUCAG  286       CTGAAGCA GGCTAGCTACAACGA ATGTCTGT  3969
UGUUUCUU A UCACCUGG



CCAGGTGA GGCTAGCTACAACGA AAGAAACA 



AAUGGCCU A CCUCCAAA 



TTTGGAGG GGCTAGCTACAACGA AGGCCATT 



UUGGAAAU A CAGUCUGC 



GCAGACTG GGCTAGCTACAACGA ATTTCCAA 



CCAAUGCU A CCCUGCCU 



AGGCAGGG GGCTAGCTACAACGA AGCATTGG 



CUCCAAUU A CAGUGACU 



AGTCACTG GGCTAGCTACAACGA AATTGGAG 



GGUAGUUU A UGCAAAUA 



TATTTGCA GGCTAGCTACAACGA AAACTACC 



AUGCAAAU A UUCGCCAA 



TTGGCGAA GGCTAGCTACAACGA ATTTGCAT 



AAACAGUU A CCUUGGAA 



TTCCAAGG GGCTAGCTACAACGA AACTGTTT 



UUGGAACU A CUGGAUAA 



TTATCCAG GGCTAGCTACAACGA AGTTCCAA 



CUGAUGCU A CUAAGGAU 



ATCCTTAG GGCTAGCTACAACGA AGCATCAG 



CGGUGUCU A CUCAAGGU 



ACCTTGAG GGCTAGCTACAACGA AGACACCG 



CUCAAGGU A UUUCACAA 



TTGTGAAA GGCTAGCTACAACGA ACCTTGAG 



CACAACUU A UGACACGA 



TCGTGTCA GGCTAGCTACAACGA AAGTTGTG 



UGGUAGAU A CAGUGUAA 



TTACACTG GGCTAGCTACAACGA ATCTACCA 



AGAGUGAU A CCCCAGCA 



TGCTGGGG GGCTAGCTACAACGA ATCACTCT 



AGCACUGU A CAUACCUG 



CAGGTATG GGCTAGCTACAACGA ACAGTGCT 



CUGUACAU A CCUGGCUG 



CAGCCAGG GGCTAGCTACAACGA ATGTACAG 



GAUGAAAU A CAAUGGAA 



TTCCATTG GGCTAGCTACAACGA ATTTCATC 



GCUCCCAU A CCUGAUCU 



AGATCAGG GGCTAGCTACAACGA ATGGGAGC 



GGAUGAUU A UGACCAUG 



CATGGTCA GGCTAGCTACAACGA AATCATCC 



UCACAAGU A UAUCAUUC 



GAATGATA GGCTAGCTACAACGA ACTTGTGA 



ACAAGUAU A UCAUUCGA 



TCGAATGA GGCTAGCTACAACGA ATACTTGT 



GAAUAAGU A CAAGUAUU 



AATACTTG GGCTAGCTACAACGA ACTTATTC 



GUACAAGU A UUCUUGAU 



ATCAAGAA GGCTAGCTACAACGA ACTTGTAC 



AAGUGAAU A CUACUGCU 



AGCAGTAG GGCTAGCTACAACGA ATTCACTT 



UGAAUACU A CUGCUCUC 



GAGAGCAG GGCTAGCTACAACGA AGTATTCA 



AAAACAUU A CUUUUGAA 



TTCAAAAG GGCTAGCTACAACGA AATGTTTT 



UCAUUGCU A UUCAGGCU 



AGCCTGAA GGCTAGCTACAACGA AGCAATGA 



UCAGAAAU A UCCAACAU 



ATGTTGGA GGCTAGCTACAACGA ATTTCTGA 



GCACGAGU A UCUUUGUU 



AACAAAGA GGCTAGCTACAACGA ACTCGTGC 



CUUUGUUU A UUCCUCCA 



TGGAGGAA GGCTAGCTACAACGA AAACAAAG 



GUCCUAAU A UUCAUAUC 



GATATGAA GGCTAGCTACAACGA ATTAGGAC 



AUAUUCAU A UCAACAGC 



GCTGTTGA GGCTAGCTACAACGA ATGAATAT 



UAAAAAUU A UGUGGAAG 



CTTCCACA GGCTAGCTACAACGA AATTTTTA 



UUUUGAUU A UAAAAUUU 



AAATTTTA GGCTAGCTACAACGA AATCAAAA 



UAAAAUGU A UUUUAGAC 



GTCTAAAA GGCTAGCTACAACGA ACATTTTA 



GGGGCGAU A UACUAAAU 



ATTTAGTA GGCTAGCTACAACGA ATCGCCCC 



GGGGCGAU A UACUAAAU 



ATTTAGTA GGCTAGCTACAACGA ATCGCCCC 



GGCGAUAU A CUAAAUGU 



ACATTTAG GGCTAGCTACAACGA ATATCGCC 



GGCGAUAU A CUAAAUGU 



ACATTTAG GGCTAGCTACAACGA ATATCGCC 



CUAAAUGU A UAUAGUAC 



GTACTATA GGCTAGCTACAACGA ACATTTAG 



AAAUGUAU A UAGUACAU 



ATGTACTA GGCTAGCTACAACGA ATACATTT 



UAUAUAGU A CAUUUAUA 



TATAAATG GGCTAGCTACAACGA ACTATATA 



GUACAUUU A UACUAAAU 



ATTTAGTA GGCTAGCTACAACGA AAATGTAC 



ACAUUUAU A CUAAAUGU 



ACATTTAG GGCTAGCTACAACGA ATAAATGT 



CUAAAUGU A UUCCUGUA 



TACAGGAA GGCTAGCTACAACGA ACATTTAG 



CUAAAUGU A UUUUAGAC 



GTCTAAAA GGCTAGCTACAACGA ACATTTAG 



AGGGGAGC A UGAAGAGG 



CCTCTTCA GGCTAGCTACAACGA GCTCCCCT 



93 



UGUCAAGC A UCUGGCAC 



GTGCCAGA GGCTAGCTACAACGA GCTTGACA 



100 
161 
195 
197 
231 
267 
299 



CAUCUGGC A CAGCUGAA 



TTCAGCTG GGCTAGCTACAACGA GCCAGATG 



UUGUUAUC A UUCUCCUA 



TAGGAGAA GGCTAGCTACAACGA GATAACAA 



AGUAAAAC A CAUCAGGU 



ACCTGATG GGCTAGCTACAACGA GTTTTACT 



UAAAACAC A UCAGGUCA 



TGACCTGA GGCTAGCTACAACGA GTGTTTTA 



GAUAAACC A CUUCCGAU 



ATCGGAAG GGCTAGCTACAACGA GGTTTATC 



AUAUUUUC A UAUCUGUA 



TACAGATA GGCTAGCTACAACGA GAAAATAT 



AGAAAGAC A CCUUCGUA 



TACGAAGG GGCTAGCTACAACGA GTCTTTCT 



103 



UAACCCGC A UUUUCCAA 



TTGGAAAA GGCTAGCTACAACGA GCGGGTTA 



GAGGAAUC A CAGGGAGA 



TCTCCCTG GGCTAGCTACAACGA GATTCCTC 



AUGGGGCC A UUUAAGAG 



CTCTTAAA GGCTAGCTACAACGA GGCCCCAT 



CUGUGUUC A UCUUGAUU 



AATCAAGA GGCTAGCTACAACGA GAACACAG 



GAUUCUUC A CCUUCUAG 



CTAGAAGG GGCTAGCTACAACGA GAAGAATC 



AGUAAUUC A CUCAUUCA 



TGAATGAG GGCTAGCTACAACGA GAATTACT 



AUUCACUC A UUCAGCUG 



CAGCTGAA GGCTAGCTACAACGA GAGTGAAT 



AUGAAGGC A UUGUCGUU 



AACGACAA GGCTAGCTACAACGA GCCTTCAT 



GAUGAAAC A CUCAUUCA 



TGAATGAG GGCTAGCTACAACGA GTTTCATC 



AAACACUC A UUCAACAA 



TTGTTGAA GGCTAGCTACAACGA GAGTGTTT 



UAAAGGAC A UGGUGACC 



GGTCACCA GGCTAGCTACAACGA GTCCTTTA 



ACCCAGGC A UCUCUGUA 



TACAGAGA GGCTAGCTACAACGA GCCTGGGT 



AUGUUGCC A UUUUGAUU 



AATCAAAA GGCTAGCTACAACGA GGCAACAT 



CCUGAAAC A UGGAAGAC 



GTCTTCCA GGCTAGCTACAACGA GTTTCAGG 



AACCCUAC A CUGAGCAG 



CTGCTCAG GGCTAGCTACAACGA GTAGGGTT 



AAGGAUCC A CCUCACUC 



GAGTGAGG GGCTAGCTACAACGA GGATCCTT 



UCCACCUC A CUCCUGAU 



ATCAGGAG GGCTAGCTACAACGA GAGGTGGA



CUGAUUUC A UUGCAGGA 



TCCTGCAA GGCTAGCTACAACGA GAAATCAG 



UAUGGACC A CAAGGUAA 



TTACCTTG GGCTAGCTACAACGA GGTCCATA 



GGUAAGGC A UUUGUCCA 



TGGACAAA GGCTAGCTACAACGA GCCTTACC 



AUUUGUCC A UGAGUGGG 



CCCACTCA GGCTAGCTACAACGA GGACAAAT 



GUGGGCUC A UCUACGAU 



ATCGTAGA GGCTAGCTACAACGA GAGCCCAC 



GCUGUUAC A CCAAAAGA 



TCTTTTGG GGCTAGCTACAACGA GTAACAGC 



AAAGAUGC A CAUUCAAU 



ATTGAATG GGCTAGCTACAACGA GCATCTTT 



AGAUGCAC A UUCAAUAA 



TTATTGAA GGCTAGCTACAACGA GTGCATCT 



AUGUUUGC A CAACAUGU 



ACATGTTG GGCTAGCTACAACGA GCAAACAT 



UGCACAAC A UGUUGAUU 



AATCAACA GGCTAGCTACAACGA GTTGTGCA 



ACAAAACC A CAACAAAG 



CTTTGTTG GGCTAGCTACAACGA GGTTTTGT 



UCCGAAGC A CAUGGGAA 



TTCCCATG GGCTAGCTACAACGA GCTTCGGA 



CGAAGCAC A UGGGAAGU 



ACTTCCCA GGCTAGCTACAACGA GTGCTTCG 



AGAAAACC A CUCCUAUG 



CATAGGAG GGCTAGCTACAACGA GGTTTTCT 



AUGACAAC A CAGCCACC 



GGTGGCTG GGCTAGCTACAACGA GTTGTCAT 



ACACAGCC A CCAAAUCC 



GGATTTGG GGCTAGCTACAACGA GGCTGTGT 



CAAAUCCC A CCUUCUCA 



TGAGAAGG GGCTAGCTACAACGA GGGATTTG 



ACCUUCUC A UUGCUGCA 



TGCAGCAA GGCTAGCTACAACGA GAGAAGGT 



CUGGAAGC A UGGCGACU 



AGTCGCCA GGCTAGCTACAACGA GCTTCCAG 



AUGGUGAC A UUUGACAG 



CTGTCAAA GGCTAGCTACAACGA GTCACCAT 



UGCUGCCC A UGUACAAA 



TTTGTACA GGCTAGCTACAACGA GGGCAGCA 



GUGAACUC A UACAGAUA 



TATCTGTA GGCTAGCTACAACGA GAGTTCAC 



ACAGGGAC A CACUCGCC 



GGCGAGTG GGCTAGCTACAACGA GTCCCTGT 



AGGGACAC A CUCGCCAA 



TTGGCGAG GGCTAGCTACAACGA GTGTCCCT 



GGACGUCC A UCUGCAGC 



GCTGCAGA GGCTAGCTACAACGA GGACGTCC 



CGAUCGGC A UUUACUGU 



ACAGTAAA GGCTAGCTACAACGA GCCGATCG



AAGACAAC A CUAUAAGU 



ACTTATAG GGCTAGCTACAACGA GTTGTCTT 



GUGGUGCC A UCAUCCAC 



GTGGATGA GGCTAGCTACAACGA GGCACCAC 



GUGCCAUC A UCCACACA 



TGTGTGGA GGCTAGCTACAACGA GATGGCAC 



CAUCAUCC A CACAGUCG 



CGACTGTG GGCTAGCTACAACGA GGATGATG 



UCAUCCAC A CAGUCGCU 



AGCGACTG GGCTAGCTACAACGA GTGGATGA 



UUACAGAC A UAUGCUUC 



GAAGCATA GGCTAGCTACAACGA GTCTGTAA 



AUGGCCUC A UUGAUGCU 



AGCATCAA GGCTAGCTACAACGA GAGGCCAT 



GCCCUUUC A UCAGGAAA 



TTTCCTGA GGCTAGCTACAACGA GAAAGGGC 
AGCGCUCC A UCCAGCUU



AAGCTGGA GGCTAGCTACAACGA GGAGCGCT 
UGAAUGGC A CAGUGAUC



GATCACTG GGCTAGCTACAACGA GCCATTCA 



UGGACAGC A CCGUGGGA 



TCCCACGG GGCTAGCTACAACGA GCTGTCCA 



GAAAGGAC A CUUUGUUU 



AAACAAAG GGCTAGCTACAACGA GTCCTTTC 



UUCUUAUC A CCUGGACA 



TGTCCAGG GGCTAGCTACAACGA GATAAGAA 
ACAAAAAC A CCAAAAUG



CATTTTGG GGCTAGCTACAACGA GTTTTTGT 



UCCCAGGC A UUGCUAAG 



CTTAGCAA GGCTAGCTACAACGA GCCTGGGA 



AGGUUGGC A CUUGGAAA 



TTTCCAAG GGCTAGCTACAACGA GCCAACCT 



GCAAGCUC A CAAACCUU 



AAGGTTTG GGCTAGCTACAACGA GAGCTTGC 



UGACUGUC A CGUCCCGU 



ACGGGACG GGCTAGCTACAACGA GACAGTCA 



ACAAGGAC A CCAGCAAA 



TTTGCTGG GGCTAGCTACAACGA GTCCTTGT 



CCAGUGUC A CAGCCCUG 



CAGGGCTG GGCTAGCTACAACGA GACACTGG 



GGUAUUUC A CAACUUAU 



ATAAGTTG GGCTAGCTACAACGA GAAATACC 



CUUAUGAC A CGAAUGGU 



ACCATTCG GGCTAGCTACAACGA GTCATAAG 



AGUGGAGC A CUGUACAU 



ATGTACAG GGCTAGCTACAACGA GCTCCACT 



CACUGUAC A UACCUGGC 



GCCAGGTA GGCTAGCTACAACGA GTACAGTG 



UGGAAUCC A CCAAGACC 



GGTCTTGG GGCTAGCTACAACGA GGATTCCA 



UGUUCAAC A CAAGCAAG 



CTTGCTTG GGCTAGCTACAACGA GTTGAACA 



AGCAGAAC A UCCUCGGG 



CCCGAGGA GGCTAGCTACAACGA GTTCTGCT 



GGAGGCUC A UUUGUGGC 



GCCACAAA GGCTAGCTACAACGA GAGCCTCC 



AUGCUCCC A UACCUGAU 



ATCAGGTA GGCTAGCTACAACGA GGGAGCAT 



CUCUUCCC A CCUGGCCA 



TGGCCAGG GGCTAGCTACAACGA GGGAAGAG 



GCCAAAUC A CCGACCUG 



CAGGTCGG GGCTAGCTACAACGA GATTTGGC 



GGAAAUUC A CGGGGGCA 



TGCCCCCG GGCTAGCTACAACGA GAATTTCC 



GCAGUCUC A UUAAUCUG 



CAGATTAA GGCTAGCTACAACGA GAGACTGC 



UUAUGACC A UGGAACAG 



CTGTTCCA GGCTAGCTACAACGA GGTCATAA 



AACAGCUC A CAAGUAUA 



TATACTTG GGCTAGCTACAACGA GAGCTGTT 



AGUAUAUC A UUCGAAUA 



TATTCGAA GGCTAGCTACAACGA GATATACT 



CUGCUCUC A UCCCAAAG 



CTTTGGGA GGCTAGCTACAACGA GAGAGCAG 



CAGAAAAC A UUACUUUU 



AAAAGTAA GGCTAGCTACAACGA GTTTTCTG 



AAAAUGGC A CAGAUCUU 



AAGATCTG GGCTAGCTACAACGA GCCATTTT 



AUCUUUUC A UUGCUAUU 



AATAGCAA GGCTAGCTACAACGA GAAAAGAT 



UAUCCAAC A UUGCACGA 



TCGTGCAA GGCTAGCTACAACGA GTTGGATA 



AACAUUGC A CGAGUAUC 



GATACTCG GGCTAGCTACAACGA GCAATGTT 



AUUCCUCC A CAGACUCC 



GGAGTCTG GGCTAGCTACAACGA GGAGGAAT 



CCAGAGAC A CCUAGUCC 



GGACTAGG GGCTAGCTACAACGA GTCTCTGG 



UAAUAUUC A UAUCAACA 



TGTTGATA GGCTAGCTACAACGA GAATATTA 



UCAACAGC A CCAUUCCU 



AGGAATGG GGCTAGCTACAACGA GCTGTTGA 



ACAGCACC A UUCCUGGC 



GCCAGGAA GGCTAGCTACAACGA GGTGCTGT 



UUCCUGGC A UUCACAUU 



AATGTGAA GGCTAGCTACAACGA GCCAGGAA 



UGGCAUUC A CAUUUUAA 



TTAAAATG GGCTAGCTACAACGA GAATGCCA 



GCAUUCAC A UUUUAAAA 



TTTTAAAA GGCTAGCTACAACGA GTGAATGC 



AAUAAAUC A UUCAUCCU 



AGGATGAA GGCTAGCTACAACGA GATTTATT 



AAUCAUUC A UCCUUUUU 



AAAAAGGA GGCTAGCTACAACGA GAATGATT 



UAUAGUAC A UUUAUACU 



AGTATAAA GGCTAGCTACAACGA GTACTATA 



ACAAGUAC G CAAUUUGA 



TCAAATTG GGCTAGCTACAACGA GTACTTGT 



CGUAACCC G CAUUUUCC 



GGAAAATG GGCTAGCTACAACGA GGGTTACG 



UUGUCGUU G CAAUCGAC 



GTCGATTG GGCTAGCTACAACGA AACGACAA 



CCCAAUGU G CCAGAAGA 



TCTTCTGG GGCTAGCTACAACGA ACATTGGG 



AAAAUGUU G CCAUUUUG 



CAAAATGG GGCTAGCTACAACGA AACATTTT 



ACAAAAAU G CUGAUGUU 



AACATCAG GGCTAGCTACAACGA ATTTTTGT 



UUCUGGUU G CUGAGUCU 



AGACTCAG GGCTAGCTACAACGA AACCAGAA 



AUUUCAUU G CAGGAAAA 



TTTTCCTG GGCTAGCTACAACGA AATGAAAT 



CAAAAGAU G CACAUUCA 



TGAATGTG GGCTAGCTACAACGA ATCTTTTG 



CCAAUCCC G CCAGACGG 



CCGTCTGG GGCTAGCTACAACGA GGGATTGG 



UAAUGUUU G CACAACAU 



ATGTTGTG GGCTAGCTACAACGA AAACATTA 



UCAAAAAU G CAAUCUCC 



GGAGATTG GGCTAGCTACAACGA ATTTTTGA 



UUCUCAUU G CUGCAGAU 



ATCTGCAG GGCTAGCTACAACGA AATGAGAA 



UCAUUGCU G CAGAUUGG 



CCAATCTG GGCTAGCTACAACGA AGCAATGA 



UGGUAACC G CCUCAAUC 



GATTGAGG GGCTAGCTACAACGA GGTTACCA 



CUUUUCCU G CUGCAGAC 



GTCTGCAG GGCTAGCTACAACGA AGGAAAAG 
UUCCUGCU G CAGACAGU



ACTGTCTG GGCTAGCTACAACGA AGCAGGAA 



UUGACAGU G CUGCCCAU 



ATGGGCAG GGCTAGCTACAACGA ACTGTCAA 



ACAGUGCU G CCCAUGUA 



TACATGGG GGCTAGCTACAACGA AGCACTGT 



ACACACUC G CCAAAAGA 



TCTTTTGG GGCTAGCTACAACGA GAGTGTGT 



GAUUACCU G CAGCAGCU 



AGCTGCTG GGCTAGCTACAACGA AGGTAATC 



GUCCAUCU G CAGCGGGC 



GCCCGCTG GGCTAGCTACAACGA AGATGGAC 



GAAAUUGU G CUGCUGAC 



GTCAGCAG GGCTAGCTACAACGA ACAATTTC 



AUUGUGCU G CUGACGGA 



TCCGTCAG GGCTAGCTACAACGA AGCACAAT 



AAGUGGGU G CUUUAACG 



CGTTAAAG GGCTAGCTACAACGA ACCCACTT 



AAAGUGGU G CCAUCAUC 



GATGATGG GGCTAGCTACAACGA ACCACTTT 



ACACAGUC G CUUUGGGG 



CCCCAAAG GGCTAGCTACAACGA GACTGTGT 



GGCCCUCU G CAGCUCAA 



TTGAGCTG GGCTAGCTACAACGA AGAGGGCC 



AGACAUAU G CUUCAGAU 



ATCTGAAG GGCTAGCTACAACGA ATATGTCT 



UCAUUGAU G CUUUUGGG 



CCCAAAAG GGCTAGCTACAACGA ATCAATGA 



CUCUCAGC G CUCCAUCC 



GGATGGAG GGCTAGCTACAACGA GCTGAGAG 



UGGACAAC G CAGCCUCC 



GGAGGCTG GGCTAGCTACAACGA GTTGTCCA 



CAGGCAUU G CUAAGGUU 



AACCTTAG GGCTAGCTACAACGA AATGCCTG 



UACAGUCU G CAAGCAAG 



CTTGCTTG GGCTAGCTACAACGA AGACTGTA 



CGUCCCGU G CGUCCAAU 



ATTGGACG GGCTAGCTACAACGA ACGGGACG 



CGUCCAAU G CUACCCUG 



CAGGGTAG GGCTAGCTACAACGA ATTGGACG 



GCUACCCU G CCUCCAAU 



ATTGGAGG GGCTAGCTACAACGA AGGGTAGC 



UAGUUUAU G CAAAUAUU 



AATATTTG GGCTAGCTACAACGA ATAAACTA 



AAAUAUUC G CCAAGGAG 



CTCCTTGG GGCTAGCTACAACGA GAATATTT 



GAGCAGGU G CUGAUGCU 



AGCATCAG GGCTAGCTACAACGA ACCTGCTC 



GUGCUGAU G CUACUAAG 



CTTAGTAG GGCTAGCTACAACGA ATCAGCAC 



GUAAAAGU G CGGGCUCU 



AGAGCCCG GGCTAGCTACAACGA ACTTTTAC 



GAGUUAAC G CAGCCAGA 



TCTGGCTG GGCTAGCTACAACGA GTTAACTC 



UCCCAAAU G CUCCCAUA 



TATGGGAG GGCTAGCTACAACGA ATTTGGGA 



AUACUACU G CUCUCAUC 



GATGAGAG GGCTAGCTACAACGA AGTAGTAT 
UUUUCAUU G CUAUUCAG



CTGAATAG GGCTAGCTACAACGA AATGAAAA 
CCAACAUU G CACGAGUA



TACTCGTG GGCTAGCTACAACGA AATGTTGG 



CAGACUCC G CCAGAGAC 



GTCTCTGG GGCTAGCTACAACGA GGAGTCTG 



AAACGUCU G CUCCUUGU 



ACAAGGAG GGCTAGCTACAACGA AGACGTTT 



GGAGAACU G CAGCUGUC 



GACAGCTG GGCTAGCTACAACGA AGTTCTCC 



AAUAAAAU G CUAAACAA 



TTGTTTAG GGCTAGCTACAACGA ATTTTATT 



27 



AAAUGGAU G UGGAAUAU 



ATATTCCA GGCTAGCTACAACGA ATCCATTT 



AUUUUCUU G UUUAAGGG 



CCCTTAAA GGCTAGCTACAACGA AAGAAAAT 



GAAGAGGU G UUGAGGUU 



AACCTCAA GGCTAGCTACAACGA ACCTCTTC 



GAGGUUAU G UCAAGCAU 



ATGCTTGA GGCTAGCTACAACGA ATAACCTC 



AAGAUAUU G UUAUCAUU 



AATGATAA GGCTAGCTACAACGA AATATCTT 



AAAGACCU G UGAUAAAC 



GTTTATCA GGCTAGCTACAACGA AGGTCTTT 



GGAAACGU G UGUCUAUA 



TATAGACA GGCTAGCTACAACGA ACGTTTCC 



AAACGUGU G UCUAUAUU 



AATATAGA GGCTAGCTACAACGA ACACGTTT 



UCAUAUCU G UAUAUAUA 



TATATATA GGCTAGCTACAACGA AGATATGA 



AGGGAGAU G UACAGCAA 



TTGCTGTA GGCTAGCTACAACGA ATCTCCCT 



AGAGUUCU G UGUUCAUC 



GATGAACA GGCTAGCTACAACGA AGAACTCT 



AGUUCUGU G UUCAUCUU 



AAGATGAA GGCTAGCTACAACGA ACAGAACT 



AAGGCAUU G UCGUUGCA 



TGCAACGA GGCTAGCTACAACGA AATGCCTT 



ACCCCAAU G UGCCAGAA 



TTCTGGCA GGCTAGCTACAACGA ATTGGGGT 



GCAUCUCU G UAUCUGUU 



AACAGATA GGCTAGCTACAACGA AGAGATGC 



CUGUAUCU G UUUGAAGC 



GCTTCAAA GGCTAGCTACAACGA AGATACAG 



UCAAAAAU G UUGCCAUU 



AATGGCAA GGCTAGCTACAACGA ATTTTTGA 



CUGACUAU G UGAGACCA 



TGGTCTCA GGCTAGCTACAACGA ATAGTCAG 



AUGCUGAU G UUCUGGUU 



AACCAGAA GGCTAGCTACAACGA ATCAGCAT 



GGGCAACU G UGGAGAGA 



TCTCTCCA GGCTAGCTACAACGA AGTTGCCC 



AGGCAUUU G UCCAUGAG 



CTCATGGA GGCTAGCTACAACGA AAATGCCT 
AGUAAGAU G UUCAGCAG



CTGCTGAA GGCTAGCTACAACGA ATCTTACT 



GUACAAAU G UAGUAAAG 



CTTTACTA GGCTAGCTACAACGA ATTTGTAC 



AAAGAAGU G UCAGGGAG 



CTCCCTGA GGCTAGCTACAACGA ACTTCTTT 



AGGCAGCU G UUACACCA 



TGGTGTAA GGCTAGCTACAACGA AGCTGCCT 



AAAAGGAU G UGAGUUUG 



CAAACTCA GGCTAGCTACAACGA ATCCTTTT 



GUGAGUUU G UUCUCCAA 



TTGGAGAA GGCTAGCTACAACGA AAACTCAC 



UCUAUAAU G UUUGCACA 



TGTGCAAA GGCTAGCTACAACGA ATTATAGA 



CACAACAU G UUGAUUCU 



AGAATCAA GGCTAGCTACAACGA ATGTTGTG 



UGAAUUCU G UACAGAAC 



GTTCTGTA GGCTAGCTACAACGA AGAATTCA 



AAAGAAUU G UGUGUUUA 



TAAACACA GGCTAGCTACAACGA AATTCTTT 



AGAAUUGU G UGUUUAGU 



ACTAAACA GGCTAGCTACAACGA ACAATTCT 



AAUUGUGU G UUUAGUCC 



GGACTAAA GGCTAGCTACAACGA ACACAATT 



CUGCCCAU G UACAAAGU 



ACTTTGTA GGCTAGCTACAACGA ATGGGCAG 



CAUUUACU G UGAUUAGG 



CCTAATCA GGCTAGCTACAACGA AGTAAATG 



CUGAAAUU G UGCUGCUG 



CAGCAGCA GGCTAGCTACAACGA AATTTCAG 



GAGGAGCU G UCCAAAAU 



ATTTTGGA GGCTAGCTACAACGA AGCTCCTC 



AUGGAGCU G UCUCUCAG 



CTGAGAGA GGCTAGCTACAACGA AGCTCCAT 



GACACUUU G UUUCUUAU 



ATAAGAAA GGCTAGCTACAACGA AAAGTGTC 



GUGGCUUU G UAGUGGAC 



GTCCACTA GGCTAGCTACAACGA AAAGCCAC 



CCCUGACU G UCACGUCC 



GGACGTGA GGCTAGCTACAACGA AGTCAGGG 



GGGCCAGU G UCACAGCC 



GGCTGTGA GGCTAGCTACAACGA ACTGGCCC 



AUGACGGU G UCUACUCA 



TGAGTAGA GGCTAGCTACAACGA ACCGTCAT 



GAUACAGU G UAAAAGUG 



CACTTTTA GGCTAGCTACAACGA ACTGTATC 



GGAGCACU G UACAUACC 



GGTATGTA GGCTAGCTACAACGA AGTGCTCC 



AGGAUGAU G UUCAACAC 



GTGTTGAA GGCTAGCTACAACGA ATCATCCT 



AAGCAAGU G UGUUUCAG 



CTGAAACA GGCTAGCTACAACGA ACTTGCTT 



GCAAGUGU G UUUCAGCA 



TGCTGAAA GGCTAGCTACAACGA ACACTTGC 
GCUCAUUU G UGGCUUCU



AGAAGCCA GGCTAGCTACAACGA AAATGAGC 



CUUCUGAU G UCCCAAAU 



ATTTGGGA GGCTAGCTACAACGA ATCAGAAG 
GUCUUUUU G UUUAAACC



GGTTTAAA GGCTAGCTACAACGA AAAAAGAC 



UUCAGGCU G UUGAUAAG 



CTTATCAA GGCTAGCTACAACGA AGCCTGAA 



GUAUCUUU G UUUAUUCC 



GGAATAAA GGCTAGCTACAACGA AAAGATAC 



UGCUCCUU G UCCUAAUA 



TATTAGGA GGCTAGCTACAACGA AAGGAGCA



AAAAUUAU G UGGAAGUG 



CACTTCCA GGCTAGCTACAACGA ATAATTTT 



CUGCAGCU G UCAAUAGC 



GCTATTGA GGCTAGCTACAACGA AGCTGCAG 



GAAUUUUU G UCAGAUAA 



TTATCTGA GGCTAGCTACAACGA AAAAATTC 



UCUAAAAU G UAUUUUAG 



CTAAAATA GGCTAGCTACAACGA ATTTTAGA 



GACUUCCU G UAGGGGGC 



GCCCCCTA GGCTAGCTACAACGA AGGAAGTC 



GACUUCCU G UAGGGGGC 



GCCCCCTA GGCTAGCTACAACGA AGGAAGTC 



UACUAAAU G UAUAUAGU 



ACTATATA GGCTAGCTACAACGA ATTTAGTA 



UACUAAAU G UAUUCCUG 



CAGGAATA GGCTAGCTACAACGA ATTTAGTA 



GUAUUCCU G UAGGGGGC 



GCCCCCTA GGCTAGCTACAACGA AGGAATAC 



UACUAAAU G UAUUUUAG 



CTAAAATA GGCTAGCTACAACGA ATTTAGTA 



UGCUUUUG G UACAAAUG 



CATTTGTA GGCTAGCTACAACGA CAAAAGCA 



UAAGGGGA G CAUGAAGA 



TCTTCATG GGCTAGCTACAACGA TCCCCTTA 



AUGAAGAG G UGUUGAGG 



CCTCAACA GGCTAGCTACAACGA CTCTTCAT 



GUGUUGAG G UUAUGUCA 



TGACATAA GGCTAGCTACAACGA CTCAACAC 



UAUGUCAA G CAUCUGGC 



GCCAGATG GGCTAGCTACAACGA TTGACATA 



AGCAUCUG G CACAGCUG 



CAGCTGTG GGCTAGCTACAACGA CAGATGCT 



CUGGCACA G CUGAAGGC 



GCCTTCAG GGCTAGCTACAACGA TGTGCCAG 



AGCUGAAG G CAGAUGGA 



TCCATCTG GGCTAGCTACAACGA CTTCAGCT 



AUUUACAA G UACGCAAU 



ATTGCGTA GGCTAGCTACAACGA TTGTAAAT 



AGACAAGA G CAAUAGUA 



TACTATTG GGCTAGCTACAACGA TCTTGTCT 



GAGCAAUA G UAAAACAC 



GTGTTTTA GGCTAGCTACAACGA TATTGCTC 



CACAUCAG G UCAGGGGG 



CCCCCTGA GGCTAGCTACAACGA CTGATGTG 



GUCAGGGG G UUAAAGAC 



GTCTTTAA GGCTAGCTACAACGA CCCCTGAC 
UCCGAUAA G UUGGAAAC



GTTTCCAA GGCTAGCTACAACGA TTATCGGA 



UUGGAAAC G UGUGUCUA 



TAGACACA GGCTAGCTACAACGA GTTTCCAA 



AUAUAAUG G UAAAGAAA 



TTTCTTTA GGCTAGCTACAACGA CATTATAT 



ACACCUUC G UAACCCGC 



GCGGGTTA GGCTAGCTACAACGA GAAGGTGT 



GAUGUACA G CAAUGGGG 



CCCCATTG GGCTAGCTACAACGA TGTACATC 



GCAAUGGG G CCAUUUAA 



TTAAATGG GGCTAGCTACAACGA CCCATTGC 



AUUUAAGA G UUCUGUGU 



ACACAGAA GGCTAGCTACAACGA TCTTAAAT 



UAGAAGGG G CCCUGAGU 



ACTCAGGG GGCTAGCTACAACGA CCCTTCTA 



GGCCCUGA G UAAUUCAC 



GTGAATTA GGCTAGCTACAACGA TCAGGGCC 



CUCAUUCA G CUGAACAA 



TTGTTCAG GGCTAGCTACAACGA TGAATGAG 



CAACAAUG G CUAUGAAG 



CTTCATAG GGCTAGCTACAACGA CATTGTTG 



CUAUGAAG G CAUUGUCG 



CGACAATG GGCTAGCTACAACGA CTTCATAG 



GCAUUGUC G UUGCAAUC 



GATTGCAA GGCTAGCTACAACGA GACAATGC 



AGGACAUG G UGACCCAG 



CTGGGTCA GGCTAGCTACAACGA CATGTCCT 



UGACCCAG G CAUCUCUG 



CAGAGATG GGCTAGCTACAACGA CTGGGTCA 



UGUUUGAA G CUACAGGA 



TCCTGTAG GGCTAGCTACAACGA TTCAAACA 



ACAGGAAA G CGAUUUUA 



TAAAATCG GGCTAGCTACAACGA TTTCCTGT 



AGACAAAG G CUGACUAU 



ATAGTCAG GGCTAGCTACAACGA CTTTGTCT 



AUGUUCUG G UUGCUGAG 



CTCAGCAA GGCTAGCTACAACGA CAGAACAT 



GUUGCUGA G UCUACUCC 



GGAGTAGA GGCTAGCTACAACGA TCAGCAAC 



UCCUCCAG G UAAUGAUG 



CATCATTA GGCTAGCTACAACGA CTGGAGGA 



UACACUGA G CAGAUGGG 



CCCATCTG GGCTAGCTACAACGA TCAGTGTA 



GCAGAUGG G CAACUGUG 



CACAGTTG GGCTAGCTACAACGA CCATCTGC 



AGAGAAGG G UGAAAGGA 



TCCTTTCA GGCTAGCTACAACGA CCTTCTCT 



GGAAAAAA G UUAGCUGA 



TCAGCTAA GGCTAGCTACAACGA TTTTTTCC 



AAAAGUUA G CUGAAUAU 



ATATTCAG GGCTAGCTACAACGA TAACTTTT 



ACCACAAG G UAAGGCAU 



ATGCCTTA GGCTAGCTACAACGA CTTGTGGT 



AAGGUAAG G CAUUUGUC 



GACAAATG GGCTAGCTACAACGA CTTACCTT 



GUCCAUGA G UGGGCUCA 



TGAGCCCA GGCTAGCTACAACGA TCATGGAC 



AUGAGUGG G CUCAUCUA 



TAGATGAG GGCTAGCTACAACGA CCACTCAT 



GAUGGGGA G UAUUUGAC 



GTCAAATA GGCTAGCTACAACGA TCCCCATC 



UUUGACGA G UACAAUAA 



TTATTGTA GGCTAGCTACAACGA TCGTCAAA 



GAAUACAA G CAGUAAGA 



TCTTACTG GGCTAGCTACAACGA TTGTATTC 



UACAAGCA G UAAGAUGU 



ACATCTTA GGCTAGCTACAACGA TGCTTGTA 



GAUGUUCA G CAGGUAUU 



AATACCTG GGCTAGCTACAACGA TGAACATC 



UUCAGCAG G UAUUACUG 



CAGTAATA GGCTAGCTACAACGA CTGCTGAA 



UAUUACUG G UACAAAUG 



CATTTGTA GGCTAGCTACAACGA CAGTAATA 



CAAAUGUA G UAAAGAAG 



CTTCTTTA GGCTAGCTACAACGA TACATTTG 



GUAAAGAA G UGUCAGGG 



CCCTGACA GGCTAGCTACAACGA TTCTTTAC 



UCAGGGAG G CAGCUGUU 



AACAGCTG GGCTAGCTACAACGA CTCCCTGA 



GGGAGGCA G CUGUUACA 



TGTAACAG GGCTAGCTACAACGA TGCCTCCC 



UCAAUAAA G UUACAGGA 



TCCTGTAA GGCTAGCTACAACGA TTTATTGA 



GGAUGUGA G UUUGUUCU 



AGAACAAA GGCTAGCTACAACGA TCACATCC 



CGGAGAAG G CUUCUAUA 



TATAGAAG GGCTAGCTACAACGA CTTCTCCG 



AUUCUAUA G UUGAAUUC 



GAATTCAA GGCTAGCTACAACGA TATAGAAT 



ACAAAGAA G CUCCAAAC 



GTTTGGAG GGCTAGCTACAACGA TTCTTTGT 



CCAAACAA G CAAAAUCA 



TGATTTTG GGCTAGCTACAACGA TTGTTTGG 



UCUCCGAA G CACAUGGG 



CCCATGTG GGCTAGCTACAACGA TTCGGAGA 



CAUGGGAA G UGAUCCGU 



ACGGATCA GGCTAGCTACAACGA TTCCCATG 



AGUGAUCC G UGAUUCUG 



CAGAATCA GGCTAGCTACAACGA GGATCACT 



ACAACACA G CCACCAAA 



TTTGGTGG GGCTAGCTACAACGA TGTGTTGT 



UGUGUUUA G UCCUUGAC 



GTCAAGGA GGCTAGCTACAACGA TAAACACA 



AUCUGGAA G CAUGGCGA 



TCGCCATG GGCTAGCTACAACGA TTCCAGAT 



GAAGCAUG G CGACUGGU 



ACCAGTCG GGCTAGCTACAACGA CATGCTTC 



GGCGACUG G UAACCGCC 



GGCGGTTA GGCTAGCTACAACGA CAGTCGCC 



UGAAUCAA G CAGGCCAG 



CTGGCCTG GGCTAGCTACAACGA TTGATTCA 
UCAAGCAG G CCAGCUUU



AAAGCTGG GGCTAGCTACAACGA CTGCTTGA 



GCAGGCCA G CUUUUCCU 



AGGAAAAG GGCTAGCTACAACGA TGGCCTGC 



UGCAGACA G UUGAGCUG 



CAGCTCAA GGCTAGCTACAACGA TGTCTGCA 



ACAGUUGA G CUGGGGUC 



GACCCCAG GGCTAGCTACAACGA TCAACTGT 



GAGCUGGG G UCCUGGGU 



ACCCAGGA GGCTAGCTACAACGA CCCAGCTC 



GGUCCUGG G UUGGGAUG 



CATCCCAA GGCTAGCTACAACGA CCAGGACC 



UUGGGAUG G UGACAUUU 



AAATGTCA GGCTAGCTACAACGA CATCCCAA 



AUUUGACA G UGCUGCCC 



GGGCAGCA GGCTAGCTACAACGA TGTCAAAT 



UGUACAAA G UGAACUCA 



TGAGTTCA GGCTAGCTACAACGA TTTGTACA 



GAUAAACA G UGGCAGUG 



CACTGCCA GGCTAGCTACAACGA TGTTTATC 



AAACAGUG G CAGUGACA 



TGTCACTG GGCTAGCTACAACGA CACTGTTT 



CAGUGGCA G UGACAGGG 



CCCTGTCA GGCTAGCTACAACGA TGCCACTG 



UACCUGCA G CAGCUUCA 



TGAAGCTG GGCTAGCTACAACGA TGCAGGTA 



CUGCAGCA G CUUCAGGA 



TCCTGAAG GGCTAGCTACAACGA TGCTGCAG 



GGAGGGAC G UCCAUCUG 



CAGATGGA GGCTAGCTACAACGA GTCCCTCC 



CAUCUGCA G CGGGCUUC 



GAAGCCCG GGCTAGCTACAACGA TGCAGATG 



UGCAGCGG G CUUCGAUC 



GATCGAAG GGCTAGCTACAACGA CCGCTGCA 



UUCGAUCG G CAUUUACU 



AGTAAATG GGCTAGCTACAACGA CGATCGAA 



CACUAUAA G UGGGUGCU 



AGCACCCA GGCTAGCTACAACGA TTATAGTG 



AUAAGUGG G UGCUUUAA 



TTAAAGCA GGCTAGCTACAACGA CCACTTAT 



UUAACGAG G UCAAACAA 



TTGTTTGA GGCTAGCTACAACGA CTCGTTAA 



1634 
1637 



CAAACAAA G UGGUGCCA 



TGGCACCA GGCTAGCTACAACGA TTTGTTTG 



ACAAAGUG G UGCCAUCA 



TGATGGCA GGCTAGCTACAACGA CACTTTGT 



1654 
1665 
1675 
1692 
1712 
1738 
1751 
1771 
1792 
1803 



UCCACACA G UCGCUUUG 



CAAAGCGA GGCTAGCTACAACGA TGTGTGGA 



GCUUUGGG G CCCUCUGC 



GCAGAGGG GGCTAGCTACAACGA CCCAAAGC 



CCUCUGCA G CUCAAGAA 



TTCTTGAG GGCTAGCTACAACGA TGCAGAGG 



CUAGAGGA G CUGUCCAA 



TTGGACAG GGCTAGCTACAACGA TCCTCTAG 



GACAGGAG G UUUACAGA 



TCTGTAAA GGCTAGCTACAACGA CTCCTGTC 



CAGAUCAA G UUCAGAAC 



GTTCTGAA GGCTAGCTACAACGA TTGATCTG 



GAACAAUG G CCUCAUUG 



CAATGAGG GGCTAGCTACAACGA CATTGTTC 



CUUUUGGG G CCCUUUCA 



TGAAAGGG GGCTAGCTACAACGA CCCAAAAG 



GAAAUGGA G CUGUCUCU 



AGAGACAG GGCTAGCTACAACGA TCCATTTC 



GUCUCUCA G CGCUCCAU 



ATGGAGCG GGCTAGCTACAACGA TGAGAGAC 



1815 



UCCAUCCA G CUUGAGAG 



CTCTCAAG GGCTAGCTACAACGA TGGATGGA 



1823 
1847 



GCUUGAGA G UAAGGGAU 



ATCCCTTA GGCTAGCTACAACGA TCTCAAGC 



CCAGAACA G CCAGUGGA 



TCCACTGG GGCTAGCTACAACGA TGTTCTGG 



AACAGCCA G UGGAUGAA 



TTCATCCA GGCTAGCTACAACGA TGGCTGTT 



GAUGAAUG G CACAGUGA 



TCACTGTG GGCTAGCTACAACGA CATTCATC 



AUGGCACA G UGAUCGUG 



CACGATCA GGCTAGCTACAACGA TGTGCCAT



CAGUGAUC G UGGACAGC 



GCTGTCCA GGCTAGCTACAACGA GATCACTG 



CGUGGACA G CACCGUGG 



CCACGGTG GGCTAGCTACAACGA TGTCCACG 



ACAGCACC G UGGGAAAG 



CTTTCCCA GGCTAGCTACAACGA GGTGCTGT 



ACAACGCA G CCUCCCCA 



TGGGGAGG GGCTAGCTACAACGA TGCGTTGT 



GGAUCCCA G UGGACAGA 



TCTGTCCA GGCTAGCTACAACGA TGGGATCC 



GGACAGAA G CAAGGUGG 



CCACCTTG GGCTAGCTACAACGA TTCTGTCC 



GAAGCAAG G UGGCUUUG 



CAAAGCCA GGCTAGCTACAACGA CTTGCTTC 



GCAAGGUG G CUUUGUAG 



CTACAAAG GGCTAGCTACAACGA CACCTTGC 



GCUUUGUA G UGGACAAA 



TTTGTCCA GGCTAGCTACAACGA TACAAAGC 



CCAAAAUG G CCUACCUC 



GAGGTAGG GGCTAGCTACAACGA CATTTTGG 



AAUCCCAG G CAUUGCUA 



TAGCAATG GGCTAGCTACAACGA CTGGGATT 



UUGCUAAG G UUGGCACU 



AGTGCCAA GGCTAGCTACAACGA CTTAGCAA 



UAAGGUUG G CACUUGGA 



TCCAAGTG GGCTAGCTACAACGA CAACCTTA 



GAAAUACA G UCUGCAAG 



CTTGCAGA GGCTAGCTACAACGA TGTATTTC 



GUCUGCAA G CAAGCUCA 



TGAGCTTG GGCTAGCTACAACGA TTGCAGAC 



GCAAGCAA G CUCACAAA 



TTTGTGAG GGCTAGCTACAACGA TTGCTTGC 



ACUGUCAC G UCCCGUGC 



GCACGGGA GGCTAGCTACAACGA GTGACAGT 
CACGUCCC G UGCGUCCA



TGGACGCA GGCTAGCTACAACGA GGGACGTG 



UCCCGUGC G UCCAAUGC 



GCATTGGA GGCTAGCTACAACGA GCACGGGA 



CAAUUACA G UGACUUCC 



GGAAGTCA GGCTAGCTACAACGA TGTAATTG 



GGACACCA G CAAAUUCC 



GGAATTTG GGCTAGCTACAACGA TGGTGTCC 



AUUCCCCA G CCCUCUGG 



CCAGAGGG GGCTAGCTACAACGA TGGGGAAT 



GCCCUCUG G UAGUUUAU 



ATAAACTA GGCTAGCTACAACGA CAGAGGGC 



CUCUGGUA G UUUAUGCA 



TGCATAAA GGCTAGCTACAACGA TACCAGAG 



GCCAAGGA G CCUCCCCA 



TGGGGAGG GGCTAGCTACAACGA TCCTTGGC 



UUCUCAGG G CCAGUGUC 



GACACTGG GGCTAGCTACAACGA CCTGAGAA 



CAGGGCCA G UGUCACAG 



CTGTGACA GGCTAGCTACAACGA TGGCCCTG 



GUGUCACA G CCCUGAUU 



AATCAGGG GGCTAGCTACAACGA TGTGACAC 



2248 
2263 



UUGAAUCA G UGAAUGGA 



TCCATTCA GGCTAGCTACAACGA TGATTCAA 



GAAAAACA G UUACCUUG 



CAAGGTAA GGCTAGCTACAACGA TGTTTTTC 



AUAAUGGA G CAGGUGCU 



AGCACCTG GGCTAGCTACAACGA TCCATTAT 



229 



UGGAGCAG G UGCUGAUG 



CATCAGCA GGCTAGCTACAACGA CTGCTCCA 



2318
2331 
2357 
2366 
2374 
2380 
2392 
2401 
2413 
2424 



GGAUGACG G UGUCUACU 



AGTAGACA GGCTAGCTACAACGA CGTCATCC 



UACUCAAG G UAUUUCAC 



GTGAAATA GGCTAGCTACAACGA CTTGAGTA 



CACGAAUG G UAGAUACA 



TGTATCTA GGCTAGCTACAACGA CATTCGTG 



UAGAUACA G UGUAAAAG 



CTTTTACA GGCTAGCTACAACGA TGTATCTA 



GUGUAAAA G UGCGGGCU 



AGCCCGCA GGCTAGCTACAACGA TTTTACAC 



AAGUGCGG G CUCUGGGA 



TCCCAGAG GGCTAGCTACAACGA CCGCACTT 



UGGGAGGA G UUAACGCA 



TGCGTTAA GGCTAGCTACAACGA TCCTCCCA 



UUAACGCA G CCAGACGG 



CCGTCTGG GGCTAGCTACAACGA TGCGTTAA 



GACGGAGA G UGAUACCC 



GGGTATCA GGCTAGCTACAACGA TCTCCGTC 



AUACCCCA G CAGAGUGG 



CCACTCTG GGCTAGCTACAACGA TGGGGTAT 



2429 



CCAGCAGA G UGGAGCAC 



GTGCTCCA GGCTAGCTACAACGA TCTGCTGG 



AGAGUGGA G CACUGUAC 



GTACAGTG GGCTAGCTACAACGA TCCACTCT 



CAUACCUG G CUGGAUUG 



CAATCCAG GGCTAGCTACAACGA CAGGTATG 



CAACACAA G CAAGUGUG 



CACACTTG GGCTAGCTACAACGA TTGTGTTG 



ACAAGCAA G UGUGUUUC 



GAAACACA GGCTAGCTACAACGA TTGCTTGT 



GUGUUUCA G CAGAACAU 



ATGTTCTG GGCTAGCTACAACGA TGAAACAC 



CUCGGGAG G CUCAUUUG 



CAAATGAG GGCTAGCTACAACGA CTCCCGAG 



CAUUUGUG G CUUCUGAU 



ATCAGAAG GGCTAGCTACAACGA CACAAATG 



CCCACCUG G CCAAAUCA 



TGATTTGG GGCTAGCTACAACGA CAGGTGGG 



ACCUGAAG G CGGAAAUU 



AATTTCCG GGCTAGCTACAACGA CTTCAGGT 



UCACGGGG G CAGUCUCA 



TGAGACTG GGCTAGCTACAACGA CCCCGTGA 



CGGGGGCA G UCUCAUUA 



TAATGAGA GGCTAGCTACAACGA TGCCCCCG 



CUUGGACA G CUCCUGGG 



CCCAGGAG GGCTAGCTACAACGA TGTCCAAG 



AUGGAACA G CUCACAAG 



CTTGTGAG GGCTAGCTACAACGA TGTTCCAT 



GCUCACAA G UAUAUCAU 



ATGATATA GGCTAGCTACAACGA TTGTGAGC 



UCGAAUAA G UACAAGUA 



TACTTGTA GGCTAGCTACAACGA TTATTCGA 



2735 
2757 



AAGUACAA G UAUUCUUG 



CAAGAATA GGCTAGCTACAACGA TTGTACTT 



AGAGACAA G UUCAAUGA 



TCATTGAA GGCTAGCTACAACGA TTGTCTCT 



2776 
2806 
2821 
2861 



CUCUUCAA G UGAAUACU 



AGTATTCA GGCTAGCTACAACGA TTGAAGAG 



CAAAGGAA G CCAACUCU 



AGAGTTGG GGCTAGCTACAACGA TTCCTTTG 



CUGAGGAA G UCUUUUUG 



CAAAAAGA GGCTAGCTACAACGA TTCCTCAG 



UGAAAAUG G CACAGAUC 



GATCTGTG GGCTAGCTACAACGA CATTTTCA 



CUAUUCAG G CUGUUGAU 



ATCAACAG GGCTAGCTACAACGA CTGAATAG 



2899 
2935 



UUGAUAAG G UCGAUCUG 



CAGATCGA GGCTAGCTACAACGA CTTATCAA 



UUGCACGA G UAUCUUUG 



CAAAGATA GGCTAGCTACAACGA TCGTGCAA 



GACACCUA G UCCUGAUG 



CATCAGGA GGCTAGCTACAACGA TAGGTGTC



GAUGAAAC G UCUGCUCC 



GGAGCAGA GGCTAGCTACAACGA GTTTCATC 



UAUCAACA G CACCAUUC 



GAATGGTG GGCTAGCTACAACGA TGTTGATA 



CAUUCCUG G CAUUCACA 



TGTGAATG GGCTAGCTACAACGA CAGGAATG 



AUGUGGAA G UGGAUAGG 



CCTATCCA GGCTAGCTACAACGA TTCCACAT 



GAACUGCA G CUGUCAAU 



ATTGACAG GGCTAGCTACAACGA TGCAGTTC 
UGUCAAUA G CCUAGGGC



GCCCTAGG GGCTAGCTACAACGA TATTGACA 



AGCCUAGG G CUGAAUUU 



AAATTCAG GGCTAGCTACAACGA CCTAGGCT 



UGUAGGGG G CGAUAUAC 



GTATATCG GGCTAGCTACAACGA CCCCTACA 



UGUAGGGG G CGAUAUAC 



GTATATCG GGCTAGCTACAACGA CCCCTACA 



UGUAUAUA G UACAUUUA 



TAAATGTA GGCTAGCTACAACGA TATATACA 



UGUAGGGG G CGAUAAAA 



TTTTATCG GGCTAGCTACAACGA CCCCTACA 



UGGUACAA A UGGAUGUG 



CACATCCA GGCTAGCTACAACGA TTGTACCA 



ACAAAUGG A UGUGGAAU 



ATTCCACA GGCTAGCTACAACGA CCATTTGT 



GAUGUGGA A UAUAAUUG 



CAATTATA GGCTAGCTACAACGA TCCACATC 



GGAAUAUA A UUGAAUAU 



ATATTCAA GGCTAGCTACAACGA TATATTCC 



AUAAUUGA A UAUUUUCU 



AGAAAATA GGCTAGCTACAACGA TCAATTAT 



114 
120 
137 
144 
150 
176 
185 
193 
217 
224 
228 



GAAGGCAG A UGGAAAUA 



TATTTCCA GGCTAGCTACAACGA CTGCCTTC 



AGAUGGAA A UAUUUACA 



TGTAAATA GGCTAGCTACAACGA TTCCATCT 



AGUACGCA A UUUGAGAC 



GTCTCAAA GGCTAGCTACAACGA TGCGTACT 



AAUUUGAG A CUAAGAUA 



TATCTTAG GGCTAGCTACAACGA CTCAAATT 



AGACUAAG A UAUUGUUA 



TAACAATA GGCTAGCTACAACGA CTTAGTCT 



UAUUGAAG A CAAGAGCA 



TGCTCTTG GGCTAGCTACAACGA CTTCAATA 



CAAGAGCA A UAGUAAAA 



TTTTACTA GGCTAGCTACAACGA TGCTCTTG 



AUAGUAAA A CACAUCAG 



CTGATGTG GGCTAGCTACAACGA TTTACTAT 



GGUUAAAG A CCUGUGAU 



ATCACAGG GGCTAGCTACAACGA CTTTAACC 



GACCUGUG A UAAACCAC 



GTGGTTTA GGCTAGCTACAACGA CACAGGTC 



UGUGAUAA A CCACUUCC 



GGAAGTGG GGCTAGCTACAACGA TTATCACA 



CACUUCCG A UAAGUUGG 



CCAACTTA GGCTAGCTACAACGA CGGAAGTG 



249 
284 
297 
308 
331 
342 
352 
385 



AGUUGGAA A CGUGUGUC 



GACACACG GGCTAGCTACAACGA TTCCAACT 



UAUAUAUA A UGGUAAAG 



CTTTACCA GGCTAGCTACAACGA TATATATA 



AAAGAAAG A CACCUUCG 



CGAAGGTG GGCTAGCTACAACGA CTTTCTTT 



CCUUCGUA A CCCGCAUU 



AATGCGGG GGCTAGCTACAACGA TACGAAGG 



AGAGAGGA A UCACAGGG 



CCCTGTGA GGCTAGCTACAACGA TCCTCTCT 



ACAGGGAG A UGUACAGC 



GCTGTACA GGCTAGCTACAACGA CTCCCTGT 



GUACAGCA A UGGGGCCA 



TGGCCCCA GGCTAGCTACAACGA TGCTGTAC 



UCAUCUUG A UUCUUCAC 



GTGAAGAA GGCTAGCTACAACGA CAAGATGA 



CCUGAGUA A UUCACUCA 



TGAGTGAA GGCTAGCTACAACGA TACTCAGG 



434 
437 



UCAGCUGA A CAACAAUG 



CATTGTTG GGCTAGCTACAACGA TCAGCTGA 



GCUGAACA A CAAUGGCU 



AGCCATTG GGCTAGCTACAACGA TGTTCAGC 



GAACAACA A UGGCUAUG 



CATAGCCA GGCTAGCTACAACGA TGTTGTTC



466 
470 
476 
488 
493 
504 
508 
515 
523 
564 
578 
592 
601 
610 
620 
630 
636 
643 
653 
659 
692 



UCGUUGCA A UCGACCCC 



GGGGTCGA GGCTAGCTACAACGA TGCAACGA 



UGCAAUCG A CCCCAAUG 



CATTGGGG GGCTAGCTACAACGA CGATTGCA 



CGACCCCA A UGUGCCAG 



CTGGCACA GGCTAGCTACAACGA TGGGGTCG 



GCCAGAAG A UGAAACAC 



GTGTTTCA GGCTAGCTACAACGA CTTCTGGC 



AAGAUGAA A CACUCAUU 



AATGAGTG GGCTAGCTACAACGA TTCATCTT 



CUCAUUCA A CAAAUAAA 



TTTATTTG GGCTAGCTACAACGA TGAATGAG 



UUCAACAA A UAAAGGAC 



GTCCTTTA GGCTAGCTACAACGA TTGTTGAA 



AAUAAAGG A CAUGGUGA 



TCACCATG GGCTAGCTACAACGA CCTTTATT 



ACAUGGUG A CCCAGGCA 



TGCCTGGG GGCTAGCTACAACGA CACCATGT 



GGAAAGCG A UUUUAUUU 



AAATAAAA GGCTAGCTACAACGA CGCTTTCC 



UUUCAAAA A UGUUGCCA 



TGGCAACA GGCTAGCTACAACGA TTTTGAAA 



CCAUUUUG A UUCCUGAA 



TTCAGGAA GGCTAGCTACAACGA CAAAATGG 



UUCCUGAA A CAUGGAAG 



CTTCCATG GGCTAGCTACAACGA TTCAGGAA 



CAUGGAAG A CAAAGGCU 



AGCCTTTG GGCTAGCTACAACGA CTTCCATG 



AAAGGCUG A CUAUGUGA 



TCACATAG GGCTAGCTACAACGA CAGCCTTT 



UAUGUGAG A CCAAAACU 



AGTTTTGG GGCTAGCTACAACGA CTCACATA 



AGACCAAA A CUUGAGAC



GTCTCAAG GGCTAGCTACAACGA TTTGGTCT 



AACUUGAG A CCUACAAA 



TTTGTAGG GGCTAGCTACAACGA CTCAAGTT 



CUACAAAA A UGCUGAUG 



CATCAGCA GGCTAGCTACAACGA TTTTGTAG 



AAAUGCUG A UGUUCUGG 



CCAGAACA GGCTAGCTACAACGA CAGCATTT 



UCCAGGUA A UGAUGAAC 



GTTCATCA GGCTAGCTACAACGA TACCTGGA 
AGGUAAUG A UGAACCCU



AGGGTTCA GGCTAGCTACAACGA CATTACCT



AAUGAUGA A CCCUACAC 



GTGTAGGG GGCTAGCTACAACGA TCATCATT 



715 
722 
745 
761 



CUGAGCAG A UGGGCAAC 



GTTGCCCA GGCTAGCTACAACGA CTGCTCAG 



GAUGGGCA A CUGUGGAG 



CTCCACAG GGCTAGCTACAACGA TGCCCATC 



GUGAAAGG A UCCACCUC 



GAGGTGGA GGCTAGCTACAACGA CCTTTCAC 



CACUCCUG A UUUCAUUG 



CAATGAAA GGCTAGCTACAACGA CAGGAGTG 



UUAGCUGA A UAUGGACC 



GGTCCATA GGCTAGCTACAACGA TCAGCTAA 



GAAUAUGG A CCACAAGG 



CCTTGTGG GGCTAGCTACAACGA CCATATTC 



CAUCUACG A UGGGGAGU 



ACTCCCCA GGCTAGCTACAACGA CGTAGATG 



AGUAUUUG A CGAGUACA 



TGTACTCG GGCTAGCTACAACGA CAAATACT 



CGAGUACA A UAAUGAUG 



CATCATTA GGCTAGCTACAACGA TGTACTCG 



GUACAAUA A UGAUGAGA 



TCTCATCA GGCTAGCTACAACGA TATTGTAC 



CAAUAAUG A UGAGAAAU 



ATTTCTCA GGCTAGCTACAACGA CATTATTG 



GAUGAGAA A UUCUACUU 



AAGTAGAA GGCTAGCTACAACGA TTCTCATC 



887 
895 



CUUAUCCA A UGGAAGAA 



TTCTTCCA GGCTAGCTACAACGA TGGATAAG 



AUGGAAGA A UACAAGCA 



TGCTTGTA GGCTAGCTACAACGA TCTTCCAT 



GCAGUAAG A UGUUCAGC 



GCTGAACA GGCTAGCTACAACGA CTTACTGC 



935 
978 



UGGUACAA A UGUAGUAA 



TTACTACA GGCTAGCTACAACGA TTGTACCA 



ACCAAAAG A UGCACAUU 



AATGTGCA GGCTAGCTACAACGA CTTTTGGT 



CACAUUCA A UAAAGUUA 



TAACTTTA GGCTAGCTACAACGA TGAATGTG 



GUUACAGG A CUCUAUGA 



TCATAGAG GGCTAGCTACAACGA CCTGTAAC 



GAAAAAGG A UGUGAGUU 



AACTCACA GGCTAGCTACAACGA CCTTTTTC



GUUCUCCA A UCCCGCCA 



TGGCGGGA GGCTAGCTACAACGA TGGAGAAC 



CCCGCCAG A CGGAGAAG 



CTTCTCCG GGCTAGCTACAACGA CTGGCGGG 



CUUCUAUA A UGUUUGCA 



TGCAAACA GGCTAGCTACAACGA TATAGAAG 



UUUGCACA A CAUGUUGA 



TCAACATG GGCTAGCTACAACGA TGTGCAAA 



ACAUGUUG A UUCUAUAG 



CTATAGAA GGCTAGCTACAACGA CAACATGT 



AUAGUUGA A UUCUGUAC 



GTACAGAA GGCTAGCTACAACGA TCAACTAT 



UGUACAGA A CAAAACCA 



TGGTTTTG GGCTAGCTACAACGA TCTGTACA 



AGAACAAA A CCACAACA 



TGTTGTGG GGCTAGCTACAACGA TTTGTTCT 



AAACCACA A CAAAGAAG 



CTTCTTTG GGCTAGCTACAACGA TGTGGTTT 



AGCUCCAA A CAAGCAAA 



TTTGCTTG GGCTAGCTACAACGA TTGGAGCT 



CAAGCAAA A UCAAAAAU 



ATTTTTGA GGCTAGCTACAACGA TTTGCTTG 



AAUCAAAA A UGCAAUCU 



AGATTGCA GGCTAGCTACAACGA TTTTGATT 



AAAAUGCA A UCUCCGAA 



TTCGGAGA GGCTAGCTACAACGA TGCATTTT 



GGGAAGUG A UCCGUGAU 



ATCACGGA GGCTAGCTACAACGA CACTTCCC 



GAUCCGUG A UUCUGAGG 



CCTCAGAA GGCTAGCTACAACGA CACGGATC 



UUCUGAGG A CUUUAAGA 



TCTTAAAG GGCTAGCTACAACGA CCTCAGAA 



UUAAGAAA A CCACUCCU 



AGGAGTGG GGCTAGCTACAACGA TTTCTTAA 



CUCCUAUG A CAACACAG 



CTGTGTTG GGCTAGCTACAACGA CATAGGAG



CUAUGACA A CACAGCCA 



TGGCTGTG GGCTAGCTACAACGA TGTCATAG 



GCCACCAA A UCCCACCU 



AGGTGGGA GGCTAGCTACAACGA TTGGTGGC 



UGCUGCAG A UUGGACAA 



TTGTCCAA GGCTAGCTACAACGA CTGCAGCA 



CAGAUUGG A CAAAGAAU 



ATTCTTTG GGCTAGCTACAACGA CCAATCTG 



GACAAAGA A UUGUGUGU 



ACACACAA GGCTAGCTACAACGA TCTTTGTC 



AGUCCUUG A CAAAUCUG 



CAGATTTG GGCTAGCTACAACGA CAAGGACT 



CUUGACAA A UCUGGAAG 



CTTCCAGA GGCTAGCTACAACGA TTGTCAAG 



GCAUGGCG A CUGGUAAC 



GTTACCAG GGCTAGCTACAACGA CGCCATGC 



GACUGGUA A CCGCCUCA 



TGAGGCGG GGCTAGCTACAACGA TACCAGTC 



CCGCCUCA A UCGACUGA 



TCAGTCGA GGCTAGCTACAACGA TGAGGCGG 



CUCAAUCG A CUGAAUCA 



TGATTCAG GGCTAGCTACAACGA CGATTGAG 



UCGACUGA A UCAAGCAG 



CTGCTTGA GGCTAGCTACAACGA TCAGTCGA 



UGCUGCAG A CAGUUGAG 



CTCAACTG GGCTAGCTACAACGA CTGCAGCA 



GGGUUGGG A UGGUGACA 



TGTCACCA GGCTAGCTACAACGA CCCAACCC 



GGAUGGUG A CAUUUGAC 



GTCAAATG GGCTAGCTACAACGA CACCATCC 



GACAUUUG A CAGUGCUG 



CAGCACTG GGCTAGCTACAACGA CAAATGTC 
CAAAGUGA A CUCAUACA



TGTATGAG GGCTAGCTACAACGA TCACTTTG 



UCAUACAG A UAAACAGU 



ACTGTTTA GGCTAGCTACAACGA CTGTATGA 



ACAGAUAA A CAGUGGCA 



TGCCACTG GGCTAGCTACAACGA TTATCTGT 



UGGCAGUG A CAGGGACA 



TGTCCCTG GGCTAGCTACAACGA CACTGCCA 



UGACAGGG A CACACUCG 



CGAGTGTG GGCTAGCTACAACGA CCCTGTCA 



GCCAAAAG A UUACCUGC 



GCAGGTAA GGCTAGCTACAACGA CTTTTGGC 



CAGGAGGG A CGUCCAUC 



GATGGACG GGCTAGCTACAACGA CCCTCCTG 



1521 
1537 



GGGCUUCG A UCGGCAUU 



AATGCCGA GGCTAGCTACAACGA CGAAGCCC 



UUACUGUG A UUAGGAAG 



CTTCCTAA GGCTAGCTACAACGA CACAGTAA 



1548 
1555 
1559 
1563 
1570 
1582 
1586 
1595 
1598 
1619 
1629 
1683 
1702 



AGGAAGAA A UAUCCAAC 



GTTGGATA GGCTAGCTACAACGA TTCTTCCT 



AAUAUCCA A CUGAUGGA 



TCCATCAG GGCTAGCTACAACGA TGGATATT 



UCCAACUG A UGGAUCUG 



CAGATCCA GGCTAGCTACAACGA CAGTTGGA 



ACUGAUGG A UCUGAAAU 



ATTTCAGA GGCTAGCTACAACGA CCATCAGT 



GAUCUGAA A UUGUGCUG 



CAGCACAA GGCTAGCTACAACGA TTCAGATC 



UGCUGCUG A CGGAUGGG 



CCCATCCG GGCTAGCTACAACGA CAGCAGCA 



GCUGACGG A UGGGGAAG 



CTTCCCCA GGCTAGCTACAACGA CCGTCAGC 



TAGTGTTG GGCTAGCTACAACGA CTTCCCCA 



GGAAGACA A CACUAUAA 



TTATAGTG GGCTAGCTACAACGA TGTCTTCC 



GUGCUUUA A CGAGGUCA 



TGACCTCG GGCTAGCTACAACGA TAAAGCAC 



GAGGUCAA A CAAAGUGG 



CCACTTTG GGCTAGCTACAACGA TTGACCTC 



GCUCAAGA A CUAGAGGA 



TCCTCTAG GGCTAGCTACAACGA TCTTGAGC 



UGUCCAAA A UGACAGGA 



TCCTGTCA GGCTAGCTACAACGA TTTGGACA 
CCAAAAUG A CAGGAGGU



ACCTCCTG GGCTAGCTACAACGA CATTTTGG 



GUUUACAG A CAUAUGCU 



AGCATATG GGCTAGCTACAACGA CTGTAAAC 



UGCUUCAG A UCAAGUUC 



GAACTTGA GGCTAGCTACAACGA CTGAAGCA 



AGUUCAGA A CAAUGGCC 



GGCCATTG GGCTAGCTACAACGA TCTGAACT 



UCAGAACA A UGGCCUCA 



TGAGGCCA GGCTAGCTACAACGA TGTTCTGA 



CCUCAUUG A UGCUUUUG 



CAAAAGCA GGCTAGCTACAACGA CAATGAGG 



AUCAGGAA A UGGAGCUG 



CAGCTCCA GGCTAGCTACAACGA TTCCTGAT 



AGUAAGGG A UUAACCCU 



AGGGTTAA GGCTAGCTACAACGA CCCTTACT 



AGGGAUUA A CCCUCCAG 



CTGGAGGG GGCTAGCTACAACGA TAATCCCT 



CCUCCAGA A CAGCCAGU 



ACTGGCTG GGCTAGCTACAACGA TCTGGAGG 



GCCAGUGG A UGAAUGGC 



GCCATTCA GGCTAGCTACAACGA CCACTGGC 



GUGGAUGA A UGGCACAG 



CTGTGCCA GGCTAGCTACAACGA TCATCCAC 



GCACAGUG A UCGUGGAC 



GTCCACGA GGCTAGCTACAACGA CACTGTGC 



GAUCGUGG A CAGCACCG 



CGGTGCTG GGCTAGCTACAACGA CCACGATC 



GGGAAAGG A CACUUUGU 



ACAAAGTG GGCTAGCTACAACGA CCTTTCCC 



UCACCUGG A CAACGCAG 



CTGCGTTG GGCTAGCTACAACGA CCAGGTGA 



CCUGGACA A CGCAGCCU 



AGGCTGCG GGCTAGCTACAACGA TGTCCAGG 



CUCCCCAA A UCCUUCUC 



GAGAAGGA GGCTAGCTACAACGA TTGGGGAG 



UCUCUGGG A UCCCAGUG 



CACTGGGA GGCTAGCTACAACGA CCCAGAGA 



CCCAGUGG A CAGAAGCA 



TGCTTCTG GGCTAGCTACAACGA CCACTGGG 



UGUAGUGG A CAAAAACA 



TGTTTTTG GGCTAGCTACAACGA CCACTACA 



GGACAAAA A CACCAAAA 



TTTTGGTG GGCTAGCTACAACGA TTTTGTCC 



ACACCAAA A UGGCCUAC



GTAGGCCA GGCTAGCTACAACGA TTTGGTGT 



ACCUCCAA A UCCCAGGC 



GCCTGGGA GGCTAGCTACAACGA TTGGAGGT 



ACUUGGAA A UACAGUCU 



AGACTGTA GGCTAGCTACAACGA TTCCAAGT 



GCUCACAA A CCUUGACC 



GGTCAAGG GGCTAGCTACAACGA TTGTGAGC 



AAACCUUG A CCCUGACU 



AGTCAGGG GGCTAGCTACAACGA CAAGGTTT 



UGACCCUG A CUGUCACG 



CGTGACAG GGCTAGCTACAACGA CAGGGTCA 



UGCGUCCA A UGCUACCC 



GGGTAGCA GGCTAGCTACAACGA TGGACGCA 



UGCCUCCA A UUACAGUG 



CACTGTAA GGCTAGCTACAACGA TGGAGGCA 



UUACAGUG A CUUCCAAA 



TTTGGAAG GGCTAGCTACAACGA CACTGTAA 



CUUCCAAA A CGAACAAG 



CTTGTTCG GGCTAGCTACAACGA TTTGGAAG 



CAAAACGA A CAAGGACA 



TGTCCTTG GGCTAGCTACAACGA TCGTTTTG 



GAACAAGG A CACCAGCA 



TGCTGGTG GGCTAGCTACAACGA CCTTGTTC 






2160 


ACCAGCAA 


— 


UUCCCCAG 


1778 


CTGGGGAA 


GGCTAGCTACAACGA 


TTGCTGGT 


4582 


2189 


UUAUGCAA


— 


UAUUCGCC 




GGCGAATA 


GGCTAGCTACAACGA 


TTGCATAA 


4583 


2212 


CCUCCCCA 


— 


UUCUCAGG 




CCTGAGAA 


GGCTAGCTACAACGA 


TGGGGAGG 


4584 


2239 


CAGCCCUG 


— 


UUGAAUCA 


1781 


TGATTCAA 


GGCTAGCTACAACGA 


CAGGGCTG 


4585 


2244 


CUGAUUGA 


A 


UCAGUGAA 


1782 


TTCACTGA 


GGCTAGCTACAACGA 


TCAATCAG 


4586 


2252 


AUCAGUGA 


— 


UGGAAAAA 




TTTTTCCA 


GGCTAGCTACAACGA 


TCACTGAT 


4587 


2260 


AUGGAAAA 


— 


CAGUUACC 


1784 


GGTAACTG 


GGCTAGCTACAACGA 


TTTTCCAT 


4588 


2274 


ACCUUGGA 


A 


CUACUGGA 


1785 


TCCAGTAG 


GGCTAGCTACAACGA 


TCCAAGGT 


4589 


2282 


ACUACUGG 


A 


UAAUGGAG 


1786 


CTCCATTA 


GGCTAGCTACAACGA 


CCAGTAGT 


4590 


2285 


ACUGGAUA 


A 


UGGAGCAG 


1787 


CTGCTCCA 


GGCTAGCTACAACGA 


TATCCAGT 


4591 


2300 


AGGUGCUG 


A 


UGCUACUA 


1788 


TAGTAGCA 


GGCTAGCTACAACGA 


CAGCACCT 


4592 


2312 


UACUAAGG 


— 


UGACGGUG 


1789 


CACCGTCA 


GGCTAGCTACAACGA 


CCTTAGTA 


4593 


LrL 


UAAGGAUG 


— 


CGGUGUCU 


1790 


AGACACCG 


GGCTAGCTACAACGA 


CATCCTTA 


4594 




AUUUCACA 


— 


CUUAUGAC 




GTCATAAG


GGCTAGCTACAACGA 


TGTGAAAT 


4595 




AACUUAUG 


— 


CACGAAUG




CATTCGTG 


GGCTAGCTACAACGA 


CATAAGTT 


4596 


2354 


UGACACGA 


— 


UGGUAGAU 


1793 


ATCTACCA 


GGCTAGCTACAACGA 


TCGTGTCA 


4597 




AAUGGUAG


— 


UACAGUGU 


1794 


ACACTGTA 


GGCTAGCTACAACGA 


CTACCATT 


4598 


2396 


AGGAGUUA 


A 


CGCAGCCA 


1795 


TGGCTGCG 


GGCTAGCTACAACGA 


TAACTCCT 


4599 


2406 


GCAGCCAG 


A 


CGGAGAGU 


1796 


ACTCTCCG 


GGCTAGCTACAACGA 


CTGGCTGC 


4600 


2416 


GGAGAGUG 


— 


UACCCCAG 


1797 


CTGGGGTA 


GGCTAGCTACAACGA


CACTCTCC 


4601 




CUGGCUGG 


A 


UUGAGAAU 


1798 


ATTCTCAA 


GGCTAGCTACAACGA 


CCAGCCAG 


4602 


2462 


GAUUGAGA 


A 


UGAUGAAA 


1799 


TTTCATCA 


GGCTAGCTACAACGA 


TCTCAATC 


4603 


2465 


UGAGAAUG 


— 


UGAAAUAC 


1800 


GTATTTCA 


GGCTAGCTACAACGA 


CATTCTCA 


4604 




AUGAUGAA 


— 


UACAAUGG 


1801 


CCATTGTA 


GGCTAGCTACAACGA 


TTCATCAT 


4605 


2475 


GAAAUACA 


— 


UGGAAUCC 


1802 


GGATTCCA 


GGCTAGCTACAACGA 


TGTATTTC 


4606 




ACAAUGGA 


— 


UCCACCAA 


1803 


TTGGTGGA 


GGCTAGCTACAACGA 


TCCATTGT 


4607 


2490 


CCACCAAG


— 


CCUGAAAU 


1804 


ATTTCAGG 


GGCTAGCTACAACGA 


CTTGGTGG 


4608 




GACCUGAA 


— 


UUAAUAAG 


1805 


CTTATTAA 


GGCTAGCTACAACGA 


TTCAGGTC 


4609 


2501 


UGAAAUUA 


— 


UAAGGAUG 


1806 


CATCCTTA 


GGCTAGCTACAACGA 


TAATTTCA 


4610 


2507 


UAAUAAGG 


A 


UGAUGUUC 


1807 


GAACATCA 


GGCTAGCTACAACGA 


CCTTATTA 


4611 




UAAGGAUG 


A 


UGUUCAAC 


1808


GTTGAACA 


GGCTAGCTACAACGA 


CATCCTTA 


4612 


2517 


GAUGUUCA 


A 


CACAAGCA 


1809 


TGCTTGTG 


GGCTAGCTACAACGA 


TGAACATC 


4613 


2542 


UCAGCAGA 


A 


CAUCCUCG 


1810


CGAGGATG 


GGCTAGCTACAACGA 


TCTGCTGA 


4614 


2573 


GGCUUCUG 


— 


UGUCCCAA 


1811 


TTGGGACA 


GGCTAGCTACAACGA 


CAGAAGCC 


4615 


2582 


UGUCCCAA 


A 


UGCUCCCA 


1812 


TGGGAGCA 


GGCTAGCTACAACGA 


TTGGGACA 


4616 




CAUACCUG 


A 


UCUCUUCC 


1813 


GGAAGAGA 


GGCTAGCTACAACGA 


CAGGTATG 


4617 




CUGGCCAA 


— 


UCACCGAC 


1814 


GTCGGTGA 


GGCTAGCTACAACGA 


TTGGCCAG 


4618 


2624 


AAUCACCG 


— 


CCUGAAGG 


1815 


CCTTCAGG 


GGCTAGCTACAACGA 


CGGTGATT 


4619 


2638 


AGGCGGAA 


— 


UUCACGGG 




CCCGTGAA 


GGCTAGCTACAACGA 


TTCCGCCT 


4620 


— 


UCUCAUUA 


— 


UCUGACUU 




AAGTCAGA 


GGCTAGCTACAACGA 


TAATGAGA 


4621 





UUAAUCUG 


— 


CUUGGACA 


1818 


TGTCCAAG 


GGCTAGCTACAACGA 


CAGATTAA 


4622 


2671 


UGACUUGG 


— 


CAGCUCCU 


1819 


AGGAGCTG 


GGCTAGCTACAACGA 


CCAAGTCA 


4623 


2684 


UCCUGGGG


— 


UGAUUAUG 


1820 


CATAATCA 


GGCTAGCTACAACGA 


CCCCAGGA 


4624 





UGGGGAUG 


— 


UUAUGACC 


1821 


GGTCATAA 


GGCTAGCTACAACGA 


CATCCCCA 


4625 


2693 


UGAUUAUG 


— 


CCAUGGAA 


1822 


TTCCATGG 


GGCTAGCTACAACGA 


CATAATCA 


4626 


2701 


ACCAUGGA 


— 


CAGCUCAC 


1823 


GTGAGCTG 


GGCTAGCTACAACGA 


TCCATGGT 


4627 


2725 


UCAUUCGA 


— 


UAAGUACA 


1824 


TGTACTTA 


GGCTAGCTACAACGA 


TCGAATGA 


4628 


2744 


UAUUCUUG 


— 


UCUCAGAG 


1825 


CTCTGAGA 


GGCTAGCTACAACGA 


CAAGAATA 


4629 


2753 


UCUCAGAG 


— 


CAAGUUCA 


1826 


TGAACTTG 


GGCTAGCTACAACGA 


CTCTGAGA 


4630 





CAAGUUCA 


— 


UGAAUCUC 


1827 


GAGATTCA 


GGCTAGCTACAACGA 


TGAACTTG 


4631 


27 ^ 


UUCAAUGA 


A 


UCUCUUCA 


1828 


TGAAGAGA 


GGCTAGCTACAACGA 


TCATTGAA 


4632 




UCAAGUGA 


A 


UACUACUG 


1829 


CAGTAGTA 


GGCTAGCTACAACGA 


TCACTTGA 


4633 


2810 


GGAAGCCA 


A 


CUCUGAGG 




CCTCAGAG 




TGGCTTCC 




2835 


UUGUUUAA 


A 


CCAGAAAA 


1831 


TTTTCTGG 


GGCTAGCTACAACGA 


TTAAACAA 


4635 


2843 


ACCAGAAA 


A 


CAUUACUU 


1832 


AAGTAATG 


GGCTAGCTACAACGA 


TTTCTGGT 


4636 


2858 


UUUUGAAA 


A 


UGGCACAG 


1833 


CTGTGCCA 


GGCTAGCTACAACGA 


TTTCAAAA 


4637 








UGGCACAG 


— 


UCUUUUCA


1834 


TGAAAAGA 


GGCTAGCTACAACGA 


CTGTGCCA 


4638 


2894 


GGCUGUUG


— 


UAAGGUCG


1835 


CGACCTTA 


GGCTAGCTACAACGA 


CAACAGCC 


4639 


2903 


UAAGGUCG


— 


UCUGAAAU


1836 


ATTTCAGA 


GGCTAGCTACAACGA 


CGACCTTA 


4640 


2910 


GAUCUGAA 


— 


UCAGAAAU




ATTTCTGA 


GGCTAGCTACAACGA 


TTCAGATC 


4641 


2917 


AAUCAGAA 


— 




1838 


GTTGGATA 


GGCTAGCTACAACGA 


TTCTGATT 


4642 


2924 


AAUAUCCA


— 


CAUUGCAC


1839 


GTGCAATG 


GGCTAGCTACAACGA 


TGGATATT 


4643 


2959 


CUCCACAG


— 


CUCCGCCA 


1840


TGGCGGAG 


GGCTAGCTACAACGA 


CTGTGGAG 


4644 


2971 




— 


CACCUAGU


1841 


ACTAGGTG 


GGCTAGCTACAACGA 


CTCTGGCG 


4645 


2984 


UAGUCCUG 


— 


UGAAACGU 




ACGTTTCA 


GGCTAGCTACAACGA 


CAGGACTA 


4646 


29 


CUGAUGAA 


— 


CGUCUGCU




AGCAGACG 


GGCTAGCTACAACGA 


TTCATCAG 


4647 




UUGUCCUA


— 


UAUUCAUA 




TATGAATA 


GGCTAGCTACAACGA 


TAGGACAA 


4648 


3 


UCAUAUCA


— 


CAGCACCA


1845 


TGGTGCTG 


GGCTAGCTACAACGA 


TGATATGA 


4649 


3052 


UUUUAAAA 


— 


UUAUGUGG 


1846 


CCACATAA 


GGCTAGCTACAACGA 


TTTTAAAA 


4650 


3067 
3077








1847 


TTCTCCTA 


GGCTAGCTACAACGA 


CCACTTCC 


4651 




AUAGGAGA 




CUGCAGCU 




AGCTGCAG 


GGCTAGCTACAACGA 


TCTCCTAT 


4652 


3088 


AGCUGUCA






1849 


CTAGGCTA 


GGCTAGCTACAACGA 


TGACAGCT 


4653 


3103 
— — — 


AGGGCUGA 


— 


UUUUUGUC 


1850 


GACAAAAA 


GGCTAGCTACAACGA 


TCAGCCCT 


4654 




UUUGUCAG


— 


UAAAUAAA 


1851 


TTTATTTA 


GGCTAGCTACAACGA 


CTGACAAA 


4655 




UCAGAUAA 


— 


UAAAAUAA 


1852 


TTATTTTA 


GGCTAGCTACAACGA 


TTATCTGA 


4656 


~312 — 



UAAAUAAA


— 


UAAAUCAU 


1853 


ATGATTTA 


GGCTAGCTACAACGA 


TTTATTTA 


4657 




UAAAAUAA 


— 


UCAUUCAU 


1854 


ATGAATGA 


GGCTAGCTACAACGA 


TTATTTTA 


4658 


— 


UUUUUUUG 


— 


UUAUAAAA 


1855 


TTTTATAA 


GGCTAGCTACAACGA 


CAAAAAAA 


4659 




AUUAUAAA 




UUUUCUAA 


1856 


TTAGAAAA 


GGCTAGCTACAACGA 


TTTATAAT 


4660 


— 


UUUCUAAA 


— 




1857 


AAAATACA 


GGCTAGCTACAACGA 


TTTAGAAA 


4661 


32 ?5 


UAUUUUAG 


— 


CUUCCUGU 


1858 


ACAGGAAG 


GGCTAGCTACAACGA 


CTAAAATA 


4662 












3245 


AGGGGGCG 


— 


UAUACUAA 


1859 


TTAGTATA


GGCTAGCTACAACGA 


CGCCCCCT 


4663 


32Q1 


UAUACUAA 


— 




1860 


TATATACA 


GGCTAGCTACAACGA 


TTAGTATA 


4664 


3225 


UAUACUAA 


— 


UGUAUUCC


1861 


GGAATACA 


GGCTAGCTACAACGA 


TTAGTATA 


4665 


T2 


UAUACUAA 


— 


UGUAUUUU 


1862 


AAAATACA 


GGCTAGCTACAACGA 


TTAGTATA 


4666 


3282 


AGGGGGCG 


-|- 


UAAAAUAA 


1863


TTATTTTA 


GGCTAGCTACAACGA 


CGCCCCCT 


4667 


3287 


GCGAUAAA 




UAAAAUGC 




GCATTTTA 


GGCTAGCTACAACGA 






3292 


AAAAUAAA 


A 


UGCUAAAC 


1865 


GTTTAGCA 


GGCTAGCTACAACGA 


TTTATTTT 


4669 


3299 


AAUGCUAA 


A 


CAACUGGG 


1866 


CCCAGTTG 


GGCTAGCTACAACGA 


TTAGCATT 


4670 


3302 


GCUAAACA 


A 


CUGGGUAA 


1867 


TTACCCAG 


GGCTAGCTACAACGA 


TGTTTAGC 


4671 



Input Sequence = NM_001285. Cut Site = R/Y

Arm Length = 8. Core Sequence = GGCTAGCTACAACGA 

NM_001285 (Homo sapiens chloride channel, calcium activated, 1 (CLCA1) mRNA, 3311 bp)
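
As an illustration only, the relationship between a substrate row and its DNAzyme row in the table above can be sketched in Python. The pairing rule used here, in which each 8-nucleotide binding arm is the DNA reverse complement of the substrate sequence on the opposite side of the unpaired cut-site purine, is inferred from the listed rows and is not stated in the footer. The function names `reverse_complement_dna`, `dnazyme_for_site`, and `find_ry_sites` are hypothetical, and the position numbering used by the table is not reproduced.

```python
# Sketch of the substrate -> DNAzyme pairing seen in the table rows.
# Assumption (inferred from the rows, not stated in the source): each binding
# arm is the DNA reverse complement of the substrate flank on the opposite side.

CORE = "GGCTAGCTACAACGA"  # core sequence given in the table footer
ARM_LENGTH = 8            # arm length given in the table footer

_RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_complement_dna(rna: str) -> str:
    """Return the DNA reverse complement of an RNA sequence."""
    return "".join(_RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


def dnazyme_for_site(upstream: str, downstream: str, core: str = CORE) -> str:
    """Build 'arm core arm' for a substrate written as 'upstream R downstream'.

    The cut-site purine (R) is left unpaired, so it is not passed in.
    """
    left_arm = reverse_complement_dna(downstream)
    right_arm = reverse_complement_dna(upstream)
    return f"{left_arm} {core} {right_arm}"


def find_ry_sites(mrna: str, arm: int = ARM_LENGTH):
    """List (upstream, purine, downstream) tuples for every R/Y site that has
    a full-length flank on both sides."""
    seq = mrna.upper().replace("T", "U")
    sites = []
    for i in range(arm, len(seq) - arm):
        if seq[i] in "AG" and seq[i + 1] in "CU":
            sites.append((seq[i - arm:i], seq[i], seq[i + 1:i + 1 + arm]))
    return sites


# Table row "UGGUACAA A UGGAUGUG" pairs with
# "CACATCCA GGCTAGCTACAACGA TTGTACCA".
example = dnazyme_for_site("UGGUACAA", "UGGAUGUG")
```

The same helper can be used to cross-check a row whose cells were damaged in reproduction: the substrate half that pairs with a legible arm is the reverse complement of that arm.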



4700
4701
4702
4703
4704
4705
4706
4707
4708
4709
4710
4711
4712
4713
4714
4715
4716
4717
4718
4719
4720
4721
4722
4723
4724
4725
4726
4727
4728
4729
4730
4731
4732


CAGAACAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGCAUUUU


AGACUCAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AACCAGAA


AGUAGACU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGCAACCA


GGGUUCAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUUACCUG


GUAGGGUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUCAUUAC 


CAUCUGCU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGUGUAGG 


GAUCCUUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACCCUUCU


AAUGAAAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGGAGUGA 


UUUUCCUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AAUGAAAU 


UCCAUAUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGCUAACU


AGCCCACU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUGGACAA 


CUCCCCAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GUAGAUGA 


GUACUCGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AAAUACUC 


AUUGUACU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GUCAAAUA 


UUUCUCAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUUAUUGU 


GAAUUUCU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUCAUUAU 


UGAAUGUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUCUUUUG 


UCCUUUUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUAGAGUC 


AACAAACU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACAUCCUU 


CCGUCUGG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GGGAUUGG 


AUGUUGUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AAACAUUA 


UAUAGAAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AACAUGUU 


ACAGAAUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AACUAUAG 


GGAGAUUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUUUUUGA 


AUGUGCUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GGAGAUUG 


UCACGGAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACUUCCCA 


CUCAGAAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACGGAUCA 


AAAGUCCU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGAAUCAC 


UGUGUUGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUAGGAGU 


AUCUGCAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AAUGAGAA 


CCAAUCUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGCAAUGA 


AGAUUUGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AAGGACUA 


UUACCAGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GCCAUGCU 


1239


1240


1241 


1242 


1243 


1244 


1245 


1246 


1247 


1248 


1249 


1250 


1251 


1252 


1253 


1254 


1255 


1256 


1257 


1258 


1259 


1260 


1261 


1262 


1263 


1264 


1265 


1266 


1267 


1268 


1269 


1270 


1271 


AAAAUGCU G AUGUUCUG 


UUCUGGUU G CUGAGUCU 


UGGUUGCU G AGUCUACU 


CAGGUAAU G AUGAACCC 


GUAAUGAU G AACCCUAC 


CCUACACU G AGCAGAUG 


AGAAGGGU G AAAGGAUC 


UCACUCCU G AUUUCAUU 


AUUUCAUU G CAGGAAAA 


AGUUAGCU G AAUAUGGA 


UUGUCCAU G AGUGGGCU 


UCAUCUAC G AUGGGGAG 


GAGUAUUU G ACGAGUAC 


UAUUUGAC G AGUACAAU 


ACAAUAAU G AUGAGAAA 


AUAAUGAU G AGAAAUUC 


CAAAAGAU G CACAUUCA 


GACUCUAU G AAAAAGGA 


AAGGAUGU G AGUUUGUU 


CCAAUCCC G CCAGACGG 


UAAUGUUU G CACAACAU 


AACAUGUU G AUUCUAUA 


CUAUAGUU G AAUUCUGU 


UCAAAAAU G CAAUCUCC 


CAAUCUCC G AAGCACAU 


UGGGAAGU G AUCCGUGA 


UGAUCCGU G AUUCUGAG 


GUGAUUCU G AGGACUUU 


ACUCCUAU G ACAACACA 


UUCUCAUU G CUGCAGAU 


UCAUUGCU G CAGAUUGG 


UAGUCCUU G ACAAAUCU 


AGCAUGGC G ACUGGUAA 




































1009 


1021 


1040 


1069 


1081 


1093 


1151 


1160 


1176 


1183 


1189 


1215 


1248 


1251 


1285 


1305



4733
4734
4735
4736
4737
4738
4739
4740
4741
4742
4743
4744
4745
4746
4747
4748
4749
4750
4751
4752
4753
4754
4755
4756
4757
4758
4759
4760
4761
4762
4763
4764
4765


GAUUGAGG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GGUUACCA 


GAUUCAGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GAUUGAGG 


GCUUGAUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGUCGAUU 


GUCUGCAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGGAAAAG 


ACUGUCUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGCAGGAA 


CCCCAGCU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AACUGUCU 


UCAAAUGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACCAUCCC 


AGCACUGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AAAUGUCA 


AUGGGCAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACUGUCAA 


UACAUGGG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGCACUGU 


UAUGAGUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACUUUGUA 


GUCCCUGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACUGCCAC 


UCUUUUGG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GAGUGUGU 


AGCUGCUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGGUAAUC 


GCCCGCUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGAUGGAC


AUGCCGAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GAAGCCCG 


UUCCUAAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACAGUAAA


AGAUCCAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGUUGGAU 


CACAAUUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGAUCCAU 


GUCAGCAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACAAUUUC 


UCCGUCAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGCACAAU


CCAUCCGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGCAGCAC 


CGUUAAAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACCCACUU 


UUUGACCU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GUUAAAGC 


GAUGAUGG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACCACUUU


CCCCAAAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GACUGUGU


UUGAGCUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGAGGGCC


CCUCCUGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUUUUGGA


AUCUGAAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUAUGUCU


AAAAGCAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AAUGAGGC


CCCAAAAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUCAAUGA


GGAUGGAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GCUGAGAG


CUUACUCU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AAGCUGGA
AUUGUGCU G CUGACGGA


GUGCUGCU G ACGGAUGG 


AAGUGGGU G CUUUAACG 
AAAGUGGU G CCAUCAUC
GUGCCAUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUCCACUG


UCCACGAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACUGUGCC 


GGAGGCUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GUUGUCCA 


AACCUUAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AAUGCCUG 


CUUGCUUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGACUGUA 


GUCAGGGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AAGGUUUG 


GUGACAGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGGGUCAA 


AUUGGACG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACGGGACG 


CAGGGUAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUUGGACG 


AUUGGAGG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGGGUAGC 


UUGGAAGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACUGUAAU 


UCCUUGUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GUUUUGGA 


AAUAUUUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUAAACUA 


CUCCUUGG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GAAUAUUU 


GAUUCAAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGGGCUGU 


CACUGAUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AAUCAGGG 


UUUCCAUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACUGAUUC 


AGCAUCAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACCUGCUC 


AGUAGCAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGCACCUG 


CUUAGUAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUCAGCAC 


GACACCGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUCCUUAG 


AUUCGUGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUAAGUUG 


CUACCAUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GUGUCAUA 


AGAGCCCG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACUUUUAC 


UCUGGCUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GUUAACUC 


UGGGGUAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACUCUCCG 


AUCAUUCU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AAUCCAGC 


UAUUUCAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUUCUCAA 


UUGUAUUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUCAUUCU 


AUUAAUUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGGUCUUG 


UUGAACAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUCCUUAU 


UGGGACAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGAAGCCA


UAUGGGAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUUUGGGA
GAAUCAGU G AAUGGAAA


GAGCAGGU G CUGAUGCU 
CUAAGGAU G ACGGUGUC


CAACUUAU G ACACGAAU 
GUAAAAGU G CGGGCUCU


GAGUUAAC G CAGCCAGA


CGGAGAGU G AUACCCCA


GCUGGAUU G AGAAUGAU


UUGAGAAU G AUGAAAUA
UCCCAAAU G CUCCCAUA
AUAUUCAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UAACUUUU


AUGCCUUA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CUUGUGGU 


GACAAAUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CUUACCUU 


UGAGCCCA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UCAUGGAC 


UAGAUGAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CCACUCAU 


GUCAAAUA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UCCCCAUC 


UUAUUGUA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UCGUCAAA 


UCUUACUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUGUAUUC 


ACAUCUUA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGCUUGUA 


AAUACCUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGAACAUC 


CAGUAAUA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CUGCUGAA 


CAUUUGUA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CAGUAAUA 


CUUCUUUA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UACAUUUG 


CCCUGACA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUCUUUAC 


AACAGCUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CUCCCUGA 


UGUAACAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGCCUCCC 


UCCUGUAA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUUAUUGA 


AGAACAAA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UCACAUCC 


UAUAGAAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CUUCUCCG 


GAAUUCAA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UAUAGAAU 


GUUUGGAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUCUUUGU 


UGAUUUUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUGUUUGG 


CCCAUGUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUCGGAGA


ACGGAUCA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUCCCAUG


CAGAAUCA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GGAUCACU


UUUGGUGG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGUGUUGU


GUCAAGGA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UAAACACA


UCGCCAUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUCCAGAU


ACCAGUCG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CAUGCUUC


GGCGGUUA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CAGUCGCC


CUGGCCUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUGAUUCA


AAAGCUGG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CUGCUUGA


AGGAAAAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGGCCUGC 
GAAUACAA G CAGUAAGA
GAUGUUCA G CAGGUAUU


UUCAGCAG G UAUUACUG 


UAUUACUG G UACAAAUG 


CAAAUGUA G UAAAGAAG 


GUAAAGAA G UGUCAGGG 


UCAGGGAG G CAGCUGUU 


GGGAGGCA G CUGUUACA 


UCAAUAAA G UUACAGGA 


GGAUGUGA G UUUGUUCU


CGGAGAAG G CUUCUAUA


AUUCUAUA G UUGAAUUC


ACAAAGAA G CUCCAAAC


CCAAACAA G CAAAAUCA


UCUCCGAA G CACAUGGG


CAUGGGAA G UGAUCCGU


AGUGAUCC G UGAUUCUG


ACAACACA G CCACCAAA


UGUGUUUA G UCCUUGAC


AUCUGGAA G CAUGGCGA


GAAGCAUG G CGACUGGU


GGCGACUG G UAACCGCC


UGAAUCAA G CAGGCCAG


UCAAGCAG G CCAGCUUU


GCAGGCCA G CUUUUCCU
CAGCUCAA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGUCUGCA


GACCCCAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UCAACUGU 


ACCCAGGA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CCCAGCUC 


CAUCCCAA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CCAGGACC 


AAAUGUCA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CAUCCCAA 


GGGCAGCA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGUCAAAU 


UGAGUUCA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUUGUACA 


CACUGCCA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGUUUAUC 


UGUCACUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CACUGUUU 


CCCUGUCA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGCCACUG 


UGAAGCUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGCAGGUA 


UCCUGAAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGCUGCAG 


CAGAUGGA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG GUCCCUCC 


GAAGCCCG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGCAGAUG 


GAUCGAAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CCGCUGCA 


AGUAAAUG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CGAUCGAA 


AGCACCCA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUAUAGUG 


UUAAAGCA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CCACUUAU 


UUGUUUGA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CUCGUUAA 


UGGCACCA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUUGUUUG 


UGAUGGCA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CACUUUGU 


CAAAGCGA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGUGUGGA 


GCAGAGGG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CCCAAAGC 


UUCUUGAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGCAGAGG 


UUGGACAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UCCUCUAG 


UCUGUAAA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CUCCUGUC 


GUUCUGAA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUGAUCUG 


CAAUGAGG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CAUUGUUC 


UGAAAGGG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CCCAAAAG


AGAGACAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UCCAUUUC


AUGGAGCG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGAGAGAC


CUCUCAAG GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGGAUGGA


AUCCCUUA GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UCUCAAGC
GAUAAACA G UGGCAGUG


AAACAGUG G CAGUGACA 


CAGUGGCA G UGACAGGG 


UACCUGCA G CAGCUUCA 


CUGCAGCA G CUUCAGGA 


GGAGGGAC G UCCAUCUG 


CAUCUGCA G CGGGCUUC ] 


UGCAGCGG G CUUCGAUC 


UUCGAUCG G CAUUUACU 


CACUAUAA G UGGGUGCU 


AUAAGUGG G UGCUUUAA


UUAACGAG G UCAAACAA


CAAACAAA G UGGUGCCA


ACAAAGUG G UGCCAUCA


UCCACACA G UCGCUUUG


GCUUUGGG G CCCUCUGC


CCUCUGCA G CUCAAGAA


CUAGAGGA G CUGUCCAA


GACAGGAG G UUUACAGA


CAGAUCAA G UUCAGAAC


GAACAAUG G CCUCAUUG


CUUUUGGG G CCCUUUCA


GAAAUGGA G CUGUCUCU


GUCUCUCA G CGCUCCAU


UCCAUCCA G CUUGAGAG


GCUUGAGA G UAAGGGAU
GACAAUGC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUCAUAGC


UUCAUCUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGGCACAU 


UGUUUCAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUCUGGCA


ACCAUGUC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUUAUUUG 


CACCAUGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CUUUAUUU 


UGGGUCAC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUGUCCUU 


AGAGAUGC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGGGUCAC


UCGCUUUC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGUAGCUU


AUCGCUUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CUGUAGCU 


UUGUCUUC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUGUUUCA 


UUUGUCUU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CAUGUUUC 


GCCUUUGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUCCAUGU 


UAGUCAGC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UUUGUCUU 


GUUUUGGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UCACAUAG


UUGUAGGU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UCAAGUUU 


UCAGCAAC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AGAACAUC 


AUCAUUAC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGGAGGAG 


UUGCCCAU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG UGCUCAGU 


CAGUUGCC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG AUCUGCUC 


ACAGUUGC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CAUCUGCU 


CUUCUCUC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG ACAGUUGC 


CCUUCUCU GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CACAGUUG 
AACUCCUC GGA GCCGUUAGGC UCCCUUCAAGGA GCCGUUAGGC UCCGGG CAGAGCCC
UUAAACCA G AAAACAUU


UUGAAAAU G GCACAGAU 


AUGGCACA G AUCUUUUC


GCUAUUCA G GCUGUUGA


GUUGAUAA G GUCGAUCU 


UGAAAUCA G AAAUAUCC 


CCUCCACA G ACUCCGCC


CUCCGCCA G AGACACCU


CCGCCAGA G ACACCUAG


CCAUUCCU G GCAUUCAC


AAUUAUGU G GAAGUGGA


GUGGAAGU G GAUAGGAG


UGGAAGUG G AUAGGAGA


AGUGGAUA G GAGAACUG


GUGGAUAG G AGAACUGC


GGAUAGGA G AACUGCAG


AUAGCCUA G GGCUGAAU


2647


2669 


2670 


2680 


2681 


2682 


2683 


2698 


2699 


2750 


2752 


2802 


2803 


2817 


2839


2860


2866


2886


2898


2914


2958 


2968


2970


3034


3059 


3065


3066


3070


3071


3073 


3096
Table X: PCR Primers

PCR primer                              Seq ID No
CGAAATCTCGAGCAGACTTGTGGGAGAAGCTC        5435
AGCACACTGCAGAGTTGCTGGCCAGCTTACCTCC      5436



